Mar 21 2013

Incremental Arrayification

Simple questions can be fun. A friend in the Pittsburgh CocoaHeads said “Hey MarkD. We’re having a discussion at work on the right way to iterate through an NSArray. One dude said to just use the syntax, and the other said that we should always use the block based iteration form. What do you think?”

My gut reaction was “Prefer so you get fast iteration and prettier syntax. Use the block based iteration if you need the generality or special features.” (That’s also the TL;DR) And then I started pondering the different ways to iterate collections.

I ended up writing a little test app that lives at this gist, and the code here is taken from that test. A big NSArray is loaded up with unique NSNumbers. The array is walked, each object fetched, and a count incremented. The time consumed is pretty much entirely looping overhead. Here’s a typical loop:

static void UseFastEnumeration (NSArray *array) {
    NSUInteger count = 0;

    for (NSNumber *number in array) {
        if (number != nil) count++;

    assert (count == array.count);

} // UseFastEnumeration

On my 2010 MacBookPro running OS X 10.7.5 it took 50 million elements in the array for a loop to take more than one second. Any decisions based on performance should therefore be taken with a large grain of salt – the per-loop performance differences are negligible.

1. Object At Index

The easiest to understand iteration techniques is writing a standard C loop and calling objectAtIndex: each time through the loop:

    for (NSUInteger i = 0; i < array.count; i++) {
        NSNumber *number = [array objectAtIndex: i];
        // work

Notice that the array’s count is fetched every time through the loop, involving an extra (and unnecessary) Objective-C message send, so two messages per loop iteration. You can hoist the count out of the loop:

    NSUInteger arrayCount = array.count;
    for (NSUInteger i = 0; i < arrayCount; i++) {
        NSNumber *number = [array objectAtIndex: i];
        // work

That saves one method call per object in the array. In the big scheme of things, it doesn’t save a lot of time. Iterating over that fifty-million item collection took about 1.86 seconds, and 1.68 seconds when hoisting out the count, or about a 10% penalty. I find the first form easier to read because there’s one less local variable to keep track of.

I sometimes use this form if I need the index, like when I’m traversing two parallel NSArrays. You can also use this form if you must change the array while iterating over it. The other iteration forms assume (and some enforce) that the collection is not mutated while it’s being iterated. ProTips: if you have a collection of indexes to remove, iterate through the array backwards so you don’t have to update the indexes, as you would if you deleted items from the front of the array. If these indexes are already in an NSIndexSet, or can be easily converted to one, you can use removeObjectsAtIndexes: and dispense with writing your own loop.

2. Old School Enumeration

In the beginning was NSEnumerator. This was the Official Object-Oriented Way of iterating through a collection. It’s pretty wordy, and it involves one object allocation and a message send for each object in the collection:

    NSEnumerator *enumerator = [array objectEnumerator];
    NSNumber *number;
    while ((number = [enumerator nextObject])) {
        // work

Performance wise, this lies between manually calling objectAtIndex and hoisting the count. That is 1.86 > 1.76 > 1.68 seconds. This makes sense – under the hood it’s probably running its own loop, getting the object by index and using the hoisted count, plus the extra message send necessary for nextObject and a little bookkeeping.

Code wise it’s pretty wordy, and it obscures the intent of the loop unless you’re really familiar with the idiom. I usually avoid this form these days.

NSEnumerators, though, have their uses. Some enumerators, like NSDirectoryEnumerator, have additional methods that you can query as you work through the collection. You can also ask NSArray for an enumerator that runs through the array backwards. You can pass this enumerator around, and also give it to fast enumeration.

I saw a one-liner tossed off on IRC the other way that makes a new array that’s the reverse of an existing one:

[[array reverseObjectEnumerator] allObjects];

That’s pretty cool, so don’t automatically discount NSEnumerator. But it’s no longer really necessary for day-to-day work.

3. Fast Enumeration

I’ve had a tiny little bit to say about fast enumeration in the past. You can read the gory details there. The fast enumeration loop looks like this:

    for (NSNumber *number in array) {
        // work

Of all the forms, this is syntactically the cleanest. It’s dead-obvious what’s going on, and there’s no extraneous fluff. Here’s the type I’m expecting, here’s the array, here’s the work. Plus it lives up to its name being fast. Spinning through 50 million objects only took 0.28 seconds, 15% of the very first loop’s time. As the youngun’s used to say, “Full of Epic Win”, or something like that.

4. Blocks

And finally, blocks. Oh blocks, how we love you. If you thought NSEnumeration was wordy, you haven’t used the block-based API:

   [array enumerateObjectsUsingBlock: ^(NSNumber *number, 
                                         NSUInteger index,
                                         BOOL *stop) {
        // work
        // Don't forget __block if you're mutating a variable in here.

That’s a lot of characters. It’s pretty straightforward once you get comfortable with the block declaration syntax : call this chunk of code for each element of the array, giving it the object, where it lives, and a back-channel to say that you found the droid you were looking for.

What you pay for in syntax you get back in flexibility. You have the index handy in case you’re rummaging through another array. You can easily terminate the loop.

Blocks in general are pretty nice – you can make a generic function or method that takes a block as a parameter. You can use this form in your generic function and just pass the block parameter to it. This makes it easy to write flexible programming interfaces.

There’s also a related, longer form of this call where you can pass some flags. There’s NSEnumerationReverse, which spins through the array backwards without messing around with reverseObjectEnumerator. And then there’s the thing that just about caused me to snoopy-dance when I saw it at WWDC when blocks were first unveiled : NSEnumerationConcurrent. Just add that flag, and suddenly your loop is spread across a bunch of cores:

    [array enumerateObjectsWithOptions: NSEnumerationConcurrent
           usingBlock: ^(NSNumber *number,
                         NSUInteger index, 
                         BOOL *stop) {
        // work
        // Don't forget you're in concurrent programming hell world now.

The UseBlockParallel() function in the iteration.m sample program just increments a counter, paying no attention to thread safety. Therefore it never calculates the right answer because the counter math needs some kind of locking around it.

Aside from fast enumeration, the block call is faster than the other forms, taking 1.46 seconds for those 50 million objects. The parallel form is a skootch faster at 1.13 seconds with no real work being done in the loop. Depending on your actual data and the work involved, you could see some decent speedups as more cores are used.


Here are the timings again in one place. They were calculated by averaging three runs over an array of 50 million unique NSNumbers in an NSArray – kick off a single run of the program that exercises the different iteration forms three times. Sit back and let it finish.

obj at index   hoisted   nsenumerator   fast enum   block    block concurrent
1.8648         1.6804    1.7604         0.2750      1.4629   1.1270

This is just looping overhead. For 50 million objects, that’s not a whole lot of time per object spent in the machinery. In general, fast enumeration is the winner in both performance and readability, and is the first tool I reach for.

The block syntax is flexible, letting you create generic APIs, plus makes it easy to do quickie parallelism. Just be very very careful before you include that NSEnumerationConcurrent flag, because you now have to care about threading issues.


  1. Ryan C. Payne

    Hey Mark! Thanks for putting this together after our discussion last week. It pretty much sums up the conclusions that we all talked through.

  2. SF

    The dude supporting for…in says thanks for the clarification.

  3. Andrea Gelati

    I have tried the same example using dispatch_apply but looks like to have a bad performance.

    • Interesting. Tried it with dispatch_apply, to a global queue of high priority, and got overhead about the same as the regular block. It’s probably the message call overhead, having to directly query the array with objectAtIndex. The concurrent block form might be able to take some shortcuts with techniques similar to fast enumeration, or some private knowledge of the array’s implementation.

  4. Steve

    You don’t need the if (number != nil) test in the first example.
    Maybe it’d go a little faster without it?
    Nice article, thank you.

    • That’s mainly there as a place holder for doing some work. Also to prevent any possible optimizations by the compiler that the return value from the iterations is being used. (not that I think the compiler would *actually* do that kind optimization, but the LLVM team is clever)

  5. Jordan

    This pretty clearly shows that the by-index version isn’t great, but I will note one way to hoist the count with a bit less cognitive overhead:

    for (NSUInteger i = 0, count = [array count]; i < count; ++i) { … }

    In C++ this is more important because finding the end() of a container might involve actual computation, but even in Objective-C it might be a good balance. If you can, though, either the for-in loop or the block-based API is probably better.

  6. Mike Llewellyn

    Great article and will be useful the next time (maybe in a few years) the performance debate comes up :)

    I almost always use for-in, and if I need the index I simply count that as we go too, which I expect will be faster than the C for loop, and I also think the intention is clearer in that case.

    I’ve yet to try the concurrency approach, hopefully soon.

  7. Mathew

    I’ve noticed that the results vary depending on what work is being done. I’ve done some varied tests and it seems like the biggest performance hit with the block enumerations is when you are accessing variables outside the scope of the block, using the __block qualifier. Removing this and just performing some simple math, the parallel block actually runs faster than fast enumeration.

    • Interesting – block parallel not using anything in the outside scope got down to 0.7386 seconds (vs still in the 0.2 range for fast enumeration) I couldn’t get it running faster than FE for pure overhead, but a pretty cool case I missed. Thanks!

  8. Dan Reese

    You can use Fast Enumeration directly with NSEnumerator. So you can avoid the allObjects call:

    for (NSNumber *number in [array reverseObjectEnumerator]) {

    It’s not as fast as NSEnumerationReverse, but is probably good enough for many cases and keeps the clean syntax.

  9. Joshua Emmons

    For what it’s worth, bbum disagrees:

    enumerateObjectsUsingBlock: will be as fast or faster than fast enumeration (for(… in …) uses the NSFastEnumeration support to implement enumeration). Fast enumeration requires translation from an internal representation to the representation for fast enumeration. There is overhead therein. Fast enumeration allows the collection class to enumerate contents as quickly as the fastest traversal of the native storage format. Likely irrelevant for arrays, but it can be a huge difference for dictionaries.

    • Interesting. Bbum knows his stuff, and I usually treat his words like gospel. That posting was from 2010 (ages ago in internet time), and my benchmarks say otherwise. I’m perfectly willing to accept they’re wrong, but I’ll need a concrete counterexample :-)

      • Todd Barker

        I have just run tests over the last few days and I have also found that NSFastEnumeration gives the best performance. If someone has examples that enumerateObjectsUsingBlock is better in some cases I would also like to see a “concrete counterexample”.

Leave a Comment

Join the discussion. Do not worry, your email address will not be published.

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>