Iterator Design Pattern

We have talked the usage of iterator when we need to interact with collections:

Not expose internal state to outside world: Increasing cohesion of class and avoid mis-operation;
A unified interface to operate on different kinds of operation;
Enforce the order of method call;

Iterator Pitfall

public interface Iterator<E> {  
  boolean hasNext();  
  E next(); 
  default void remove() {  
    throw new UnsupportedOperationException("remove");  
  }
  default void forEachRemaining(Consumer<? super E> action) {  
    Objects.requireNonNull(action);  
    while (hasNext())  
      action.accept(next());  
  }
}

Checkout the interface of iterator (two more default method added in Java8), and we can see the common usage routine of Iterator.

    while (hasNext())  
      T t = next(); 
      // do with t

But it is not so friendly for newbie. We may call the next() directly without checking whether hasNext() and surprised by a horrible NoSuchElementException.

And in order to make it, the collection coder have to do many defensive and duplicative coding. They have to consider: What if the user doesn’t call hasNext() before next()? What if it calls hasNext() more than once?

Additionally, the two-method protocol generally requires the statefulness of Iterator, such as peeking ahead one element (and keeping track of whether you’ve already peeked ahead), then reverting back in some cases.

New Iterator

In order to solve the problem, Java8 introduced a new iterator called Spliterator. Although Spliterator is also a basic Iterator, it doesn’t extend Iterator, instead taking a different approach to element access.

As we have said, an Iterator has two methods to access element, hasNext() and next(); But in Java8, we have lambdas in the language enables Spliterator to take an approach to element access that’s generally more efficient — and easier to code correctly. Spliterator has two methods for accessing elements:

boolean tryAdvance(Consumer<? super T> action);
void forEachRemaining(Consumer<? super T> action);

With the help of Consumer, we can iterate the collection item and just return boolean to indicate whether we have more elements. We make it in one method call, more elegant and easier to do it right.

Besides a better design which eliminates the possible error Iterator may bring, Spliterator also born with ability to do things more efficiently. As its name indicates, it can split a iterator into two:

Spliterator<T> trySplit();

Why it can be more efficient to do element access using Spliterator? Because we can do it in parallel by splitting large collection into multiple small piece, which is used in parallelStream().

`Spliterator` Examples

An ArrayList with more than one element can always be split evenly and easily because it can split using middle of index without changing the underlying array. Similarly, HashSet split collection based on the underlying Node[] and index.

int hi = getFence(), lo = index, mid = (lo + hi) >>> 1;  
return (lo >= mid) ? null : // divide range in half unless too small  
  new ArrayListSpliterator<E>(list, lo, index = mid,  
                                expectedModCount);

But, a LinkedList always splits poorly, because it will copy the element into an array to split, in which way, it can accelerate further split as ArrayList does.

int n = batch + BATCH_UNIT;  
if (n > s)  
    n = s;  
if (n > MAX_BATCH)  
    n = MAX_BATCH;  
Object[] a = new Object[n];  
int j = 0;  
do { a[j++] = p.item; } while ((p = p.next) != null && j < n);  
current = p;  
batch = j;  
est = s - j;  
return Spliterators.spliterator(a, 0, j, Spliterator.ORDERED);

Things goes a little different in TreeSet: because it is implemented using Red-Black Tree, it is always split by tree’s root. Thanks to the balance of Red-Black Tree, it can always done evenly.

This is some code examples to test.

Ref

Java Linkedlist Spliterator

Written with StackEdit.

On teh way

Blog Search