“Copy-on-Iterate” Java Idiom considered broken
July 22nd, 2010 by Jevgeni KabanovThis is a story of an interesting support request handled by our guru Lauri Tulmin. The inquiry was about a rare occurring ArrayIndexOutOfBoundsException that JRebel seemed to cause in Wicket. After some investigation he discovered that the root exception was thrown from the following Wicket code:
private final Map<IModifiable, Entry> modifiableToEntry = new ConcurrentHashMap(); public void start(Duration pollFrequency) { this.task = new Task("ModificationWatcher"); this.task.run(pollFrequency, new ICode() { public void run(Logger log) { Iterator iter = new ArrayList( ModificationWatcher.this.modifiableToEntry.values()).iterator(); while (iter.hasNext())
Specifically the exception was thrown from the constructor of ArrayList. How was that possible?
Let’s take a step back and review the reasons for this code. One issue with collections is that when iterating a ConcurrentModificationException is thrown if a collection is changed (usually by a different thread). This is done to protect the Iterator from unpredictable behaviour. A common idiom to avoid it is to copy the collection before iteration like that:
for(Iterator i = new ArrayList(collection).iterator(); i.hasNext();) {...}
Note that the collection must either to be synchronized or otherwise suitable for a multi-threaded environment.
This idiom is a very common one, as easily proven by a simple Google Code search. In fact we used it several times in the JRebel code and is present in several places in Wicket as well. So how could this cause an ArrayIndexOutOfBoundsException?
When Lauri investigated in depth he found out that this idiom is inherently unsafe in a multi-threaded environment (even when the underlying collection is synchronized!). The reason for that is the way ArrayList was constructed before Java 1.6. In my 1.5 Java SDK source code it looks like this:
public ArrayList(Collection<? extends E> collection) { int size = collection.size(); firstIndex = 0; array = newElementArray(size + (size / 10)); collection.toArray(array); lastIndex = size; modCount = 1; }
The issue is that there is a race condition between the time the size is recorded and the collection.toArray(array) is called. During that time it is conceivable (and indeed possible) that the size of the collection is changed by a different thread. If the size is now bigger, the array copying in toArray() fails with the dreaded ArrayIndexOutOfBoundsException. When researching the issue further we found that the constructor in Java 1.6 ArrayList has been changed to avoid this issue, so at least on Oracle Java 1.6 or later you are safe. Unfortunately I couldn’t find a specific bug referring to that issue, so likely as not it’s an accidental fix.
So what should you do to avoid this issue? One way is to put the synchronized block around the whole loop, but this requires that you only access that collection from other threads in the same synchronized block. An easier solution is to use the toArray()method like this:
for (Iterator i = Arrays.asList(collection.toArray()).iterator(); i.hasNext();)

