Java Source Code Analysis: CopyOnWriteArrayList Explained

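This article is based on JDK 1.8.

ArrayList and HashMap are collections we use all the time, and neither is thread-safe. Most of us know that the thread-safe counterpart of HashMap is ConcurrentHashMap. Is there a similar thread-safe version of ArrayList? There is: CopyOnWriteArrayList.

The phrase "copy-on-write" also has a dedicated abbreviation, COW. COW is not a mechanism specific to Java's collections framework; it is used widely throughout computing.

First, what is CopyOnWriteArrayList? The class Javadoc is long, so only the opening portion is quoted below. It says that CopyOnWriteArrayList is a thread-safe variant of ArrayList in which every mutative operation (add, set, and so on) is implemented by making a fresh copy of the underlying array. This is ordinarily too costly, but when the threads reading the list vastly outnumber the threads writing to it, this approach can still be more efficient than the alternatives.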

/**
 * A thread-safe variant of {@link java.util.ArrayList} in which all mutative
 * operations ({@code add}, {@code set}, and so on) are implemented by
 * making a fresh copy of the underlying array.
 *
 * This is ordinarily too costly, but may be more efficient
 * than alternatives when traversal operations vastly outnumber
 * mutations, and is useful when you cannot or don't want to
 * synchronize traversals, yet need to preclude interference among
 * concurrent threads. The "snapshot" style iterator method uses a
 * reference to the state of the array at the point that the iterator
 * was created. This array never changes during the lifetime of the
 * iterator, so interference is impossible and the iterator is
 * guaranteed not to throw {@code ConcurrentModificationException}.
 * The iterator will not reflect additions, removals, or changes to
 * the list since the iterator was created. Element-changing
 * operations on iterators themselves ({@code remove}, {@code set}, and
 * {@code add}) are not supported. These methods throw
 * {@code UnsupportedOperationException}.
 */
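The two iterator guarantees in the Javadoc above can be checked with a small program. This is only a sketch: the class name SnapshotIteratorDemo and its helper methods are made up for illustration, and the ArrayList case is included purely for contrast.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical demo class, not part of the JDK.
public class SnapshotIteratorDemo {

    // Returns true if adding to the list while iterating over it
    // throws ConcurrentModificationException.
    public static boolean throwsCme(List<String> list) {
        list.add("a");
        list.add("b");
        list.add("c");
        try {
            for (String s : list) {
                if (s.equals("a")) {
                    list.add("x"); // structural change during iteration
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Returns true if the list's iterator rejects remove().
    public static boolean iteratorRemoveUnsupported(List<String> list) {
        list.add("a");
        Iterator<String> it = list.iterator();
        it.next();
        try {
            it.remove();
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(throwsCme(new ArrayList<>()));                            // true
        System.out.println(throwsCme(new CopyOnWriteArrayList<>()));                 // false
        System.out.println(iteratorRemoveUnsupported(new CopyOnWriteArrayList<>())); // true
    }
}
```

ArrayList's fail-fast iterator detects the change on the next call to next(), whereas the CopyOnWriteArrayList iterator keeps walking its snapshot and finishes normally.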

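Next, the member fields. There are only two: array, the underlying data structure that holds the elements, and a reentrant lock that synchronizes write operations.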

/** The lock protecting all mutators */
final transient ReentrantLock lock = new ReentrantLock();

/** The array, accessed only via getArray/setArray. */
private transient volatile Object[] array;

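Now the main methods, starting with get. There is nothing special about it: it takes no lock and simply reads from the current array. Because the array field is volatile, a reader always sees the most recently published array.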

/**
 * {@inheritDoc}
 *
 * @throws IndexOutOfBoundsException {@inheritDoc}
 */
public E get(int index) {
    return get(getArray(), index);
}

/**
 * Gets the array. Non-private so as to also be accessible
 * from CopyOnWriteArraySet class.
 */
final Object[] getArray() {
    return array;
}

@SuppressWarnings("unchecked")
private E get(Object[] a, int index) {
    return (E) a[index];
}

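Next, add. It first acquires the lock, then copies the existing array into a new array that is one element longer, stores the new element in the last slot of the new array, and finally assigns the new array to the array field. As you can see, add never modifies the existing array in place: it copies all of the data, operates on the copy, and then swaps the copy in.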

/**
 * Appends the specified element to the end of this list.
 *
 * @param e element to be appended to this list
 * @return {@code true} (as specified by {@link Collection#add})
 */
public boolean add(E e) {
    final ReentrantLock lock = this.lock;
    lock.lock();
    try {
        Object[] elements = getArray();
        int len = elements.length;
        Object[] newElements = Arrays.copyOf(elements, len + 1);
        newElements[len] = e;
        setArray(newElements);
        return true;
    } finally {
        lock.unlock();
    }
}

/**
 * Sets the array.
 */
final void setArray(Object[] a) {
    array = a;
}
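To make the copy-then-swap pattern easier to see in isolation, here is a minimal sketch of a copy-on-write container. TinyCowList and its snapshot() method are hypothetical names invented for this illustration; this is a simplification under those assumptions, not the JDK implementation.

```java
import java.util.Arrays;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical, simplified container; not the JDK implementation.
public class TinyCowList<E> {
    // Serializes writers so that two adds cannot copy the same old array.
    private final ReentrantLock lock = new ReentrantLock();
    // volatile: a reader always sees the most recently published array.
    private volatile Object[] array = new Object[0];

    public boolean add(E e) {
        lock.lock();
        try {
            Object[] current = array;
            Object[] copy = Arrays.copyOf(current, current.length + 1);
            copy[current.length] = e;
            array = copy; // publish the new array; the old one is left untouched
            return true;
        } finally {
            lock.unlock();
        }
    }

    @SuppressWarnings("unchecked")
    public E get(int index) {
        return (E) array[index]; // lock-free read of the current array
    }

    // Exposes the current backing array so the demo can show it being replaced.
    public Object[] snapshot() {
        return array;
    }

    public static void main(String[] args) {
        TinyCowList<String> list = new TinyCowList<>();
        list.add("a");
        Object[] before = list.snapshot();
        list.add("b");
        Object[] after = list.snapshot();
        System.out.println(before != after);                    // true
        System.out.println(before.length + " " + after.length); // 1 2
        System.out.println(list.get(1));                        // b
    }
}
```

The array captured before the second add still has length 1, which is exactly why a reader holding an old reference is never disturbed by later writes.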

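Now consider a question. Thread 1 is traversing the list, and in the meantime thread 2 writes to the list. Can thread 1 see the data that thread 2 wrote?

First of all, this scenario does not throw any exception; the program quietly runs to completion. Whether thread 1 sees thread 2's write depends on how thread 1 traverses the list and on when and where thread 2 writes.

Start with the traversal method. There are two ways to traverse the list: foreach and get(i). foreach is implemented with an iterator under the hood, so the iterator is not treated as a separate method. Consider the for loop with get(i) first. With this method, whether thread 1 sees thread 2's write depends on the timing and position of the write. If thread 1 has already reached the 5th element and thread 2 then writes somewhere after the 5th element, thread 1 will see thread 2's write.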

import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class MyClass {
    static List<String> list = new CopyOnWriteArrayList<>();

    public static void main(String[] args) {
        list.add("a");
        list.add("b");
        list.add("c");
        list.add("d");
        list.add("e");
        list.add("f");
        list.add("g");
        list.add("h");
        // Start thread 1, which traverses the list by index
        new Thread(() -> {
            try {
                for (int i = 0; i < list.size(); i++) {
                    System.out.println(list.get(i));
                    Thread.sleep(1000);
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }).start();
        try {
            // The main thread acts as thread 2 and waits 2 s
            Thread.sleep(2000);
        } catch (Exception e) {
            e.printStackTrace();
        }
        // Thread 2 inserts at index 4, i.e. after the position thread 1 has reached
        list.add(4, "n");
    }
}

When the program above is run, thread 1 does see n during its traversal.
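The same interleaving can be reproduced deterministically in a single thread, which removes the dependence on sleep timing. In this sketch the class name InsertAfterCursorDemo is made up, and the insert is assumed to happen just before the loop reads index 2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical single-threaded reproduction of the two-thread example.
public class InsertAfterCursorDemo {

    public static List<String> traverse() {
        List<String> list = new CopyOnWriteArrayList<>(
                Arrays.asList("a", "b", "c", "d", "e", "f", "g", "h"));
        List<String> seen = new ArrayList<>();
        // list.size() is re-evaluated on every pass, so the loop notices growth.
        for (int i = 0; i < list.size(); i++) {
            if (i == 2) {
                // Stands in for thread 2: insert ahead of the current index.
                list.add(4, "n");
            }
            seen.add(list.get(i));
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(traverse()); // [a, b, c, d, n, e, f, g, h]
    }
}
```

Because each get(i) reads whatever array is current at that moment, the element inserted at index 4 is picked up when the loop reaches that index.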

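If thread 2 writes before the 5th position instead, thread 1 will not see thread 2's write. There is also a side effect: one element will be read twice. The code is as follows: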

import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class MyClass {
    static List<String> list = new CopyOnWriteArrayList<>();

    public static void main(String[] args) {
        list.add("a");
        list.add("b");
        list.add("c");
        list.add("d");
        list.add("e");
        list.add("f");
        list.add("g");
        list.add("h");
        // Start thread 1, which traverses the list by index
        new Thread(() -> {
            try {
                for (int i = 0; i < list.size(); i++) {
                    System.out.println(list.get(i));
                    Thread.sleep(1000);
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }).start();
        try {
            // The main thread acts as thread 2 and waits 2 s
            Thread.sleep(2000);
        } catch (Exception e) {
            e.printStackTrace();
        }
        // Thread 2 inserts at index 1, i.e. before the position thread 1 has reached
        list.add(1, "n");
    }
}

When the code above is run with this timing, n is never printed and b is traversed twice.
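This case can also be reproduced without threads. The sketch below uses a made-up class name, InsertBeforeCursorDemo, and assumes the insert lands after index 1 has been read and before index 2 is read, which is the interleaving the sleeps above are aiming for.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical single-threaded reproduction of the two-thread example.
public class InsertBeforeCursorDemo {

    public static List<String> traverse() {
        List<String> list = new CopyOnWriteArrayList<>(
                Arrays.asList("a", "b", "c", "d", "e", "f", "g", "h"));
        List<String> seen = new ArrayList<>();
        for (int i = 0; i < list.size(); i++) {
            if (i == 2) {
                // Stands in for thread 2: insert behind the current index.
                // Every element from index 1 onward shifts right by one.
                list.add(1, "n");
            }
            seen.add(list.get(i));
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(traverse()); // [a, b, b, c, d, e, f, g, h]
    }
}
```

After the insert, b moves from index 1 to index 2, so the loop reads it a second time, while n sits at an index the loop has already passed.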

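What about foreach traversal? Once the iterator has been created, thread 1 cannot see thread 2's writes, no matter when they happen. The reason is that CopyOnWriteArrayList captures a snapshot of the array at the moment the iterator is created. An add builds a new array and installs it in the list; the snapshot array held by the iterator is never touched.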

public Iterator<E> iterator() {
    return new COWIterator<E>(getArray(), 0);
}

// Constructor of the nested COWIterator class
private COWIterator(Object[] elements, int initialCursor) {
    cursor = initialCursor;
    snapshot = elements;
}
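The snapshot behavior is easy to observe from the outside. In this sketch (the class name ForEachSnapshotDemo is made up), an element is appended in the middle of a foreach loop. The running loop does not see it, but a traversal started afterwards does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical demo class, not part of the JDK.
public class ForEachSnapshotDemo {

    // Iterates with foreach and appends "n" while the loop is still running.
    public static List<String> iterateWhileWriting(List<String> list) {
        List<String> seen = new ArrayList<>();
        for (String s : list) {
            if (s.equals("a")) {
                list.add("n"); // installs a new array; the iterator keeps the old one
            }
            seen.add(s);
        }
        return seen;
    }

    public static void main(String[] args) {
        List<String> list = new CopyOnWriteArrayList<>(Arrays.asList("a", "b", "c"));
        System.out.println(iterateWhileWriting(list)); // [a, b, c]
        // A traversal started after the write uses a fresh snapshot.
        System.out.println(new ArrayList<>(list));     // [a, b, c, n]
    }
}
```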

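With a clear understanding of how the traversal method and the write timing affect whether a write is visible, we can choose the approach that fits the actual business scenario when using CopyOnWriteArrayList.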

Summary

That is all for this article. I hope it is of some value for your study or work. Thank you for your support.

