Re: Leopard performance penalty (3x slower), NSPopAutoreleasePool
FROM : Ben Trumbull
DATE : Mon Nov 19 03:45:03 2007

>> On Nov 17, 2007, at 10:53 AM, Gerd Knops wrote:
>>
>> You should probably investigate trying to create fewer temporary
>> objects, and fewer autoreleased objects, as a way to fix this
>> problem on your end.
>>

> That would be a less than fun task, given >50.000 lines of code...
>
> I did sprinkle a number of local NSAutoReleasePools, so that
> autoreleased objects do not amass. That made no (measurable)
> difference in performance at all. I no longer have the 4 second period
> of unresponsiveness, but the overall process takes 4 seconds longer,
> so that was a wash. By adding timestamps before and after the
> [autoReleasePool release] I can see that they now roughly share the
> burden (eg take about the same time). So having all the temporary
> objects in one large pool or a number of smaller pools makes no
> difference.


If Shark is reporting a lot of time spent popping an autorelease pool, 
then using fewer temporary/autoreleased objects is the obvious fix.

-autorelease is a convenience.  It's handy, saves a couple lines of 
code, can provide a simpler API, and can eliminate extra code in 
places you expect exceptions to be thrown.

It's also, relative to just -release, expensive.  In addition to all 
the extra work to keep track of the object until the pool is released, 
it requires more memory.  The memory for a temporary object can't be 
reused for something else until you actually release it.  So an 
autorelease pool extends the lifetime of those objects.  Sometimes 
that's a useful feature.  It can make an API less error prone. 
Sometimes it's just a performance drain.  Unnecessarily extending the 
lifetime of memory blocks means that the process needs more memory. 
It needs to compensate for all the temporary blocks that are pending 
in the autorelease pool, but never used again.  Growing the 
high-water mark of a process heap means that more memory needs to be 
allocated from the OS.  That is much, much, much more expensive than 
-release.
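
To make the lifetime argument concrete, here is a toy model in plain C. 
It is only a sketch and not how Foundation implements autorelease 
pools: the names tracked_alloc, tracked_free and the two peak_with_* 
functions are made up for illustration.  It simply counts how many 
blocks are alive at once when each temporary is freed immediately, 
versus when every free is deferred until the end of the loop, the way 
a single pool would defer it.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only: NOT Foundation's autorelease implementation.
   All names here are hypothetical. */

static size_t live = 0;   /* blocks currently allocated */
static size_t peak = 0;   /* high-water mark of `live` */

static void *tracked_alloc(size_t size) {
    void *p = malloc(size);
    if (p != NULL) {
        live++;
        if (live > peak) peak = live;
    }
    return p;
}

static void tracked_free(void *p) {
    if (p != NULL) {
        assert(live > 0);
        free(p);
        live--;
    }
}

/* Free each temporary as soon as the iteration is done with it
   (the analogue of calling -release directly). */
static size_t peak_with_immediate_release(size_t iterations) {
    live = peak = 0;
    for (size_t i = 0; i < iterations; i++) {
        void *tmp = tracked_alloc(64);
        tracked_free(tmp);
    }
    return peak;
}

/* Defer every free until the loop ends (the analogue of one big pool). */
static size_t peak_with_deferred_release(size_t iterations) {
    live = peak = 0;
    void **pending = malloc(iterations * sizeof *pending);
    if (pending == NULL) return 0;
    for (size_t i = 0; i < iterations; i++) {
        pending[i] = tracked_alloc(64);
    }
    for (size_t i = 0; i < iterations; i++) {
        tracked_free(pending[i]);   /* the "pool pop" */
    }
    free(pending);
    return peak;
}
```

Calling peak_with_immediate_release(1000) reports a peak of one live 
block, because the same memory can be handed out again on every pass. 
peak_with_deferred_release(1000) reports a peak of 1000, and that 
difference is what pushes the heap's high-water mark up.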

So using autoreleased objects within performance critical loops is 
very counterproductive.

It also sounds like you're allocating a large number of objects.  You 
may be able to improve performance by allocating fewer objects 
overall.  This is more work than simply releasing temporary objects 
more aggressively, but can provide additional improvements.  Depending 
on what you need, you can reuse allocated objects (allocate a mutable 
object, and mutate instead), malloc a large buffer and divide it up 
yourself, or use the batch malloc APIs in <malloc/malloc.h>, such as 
malloc_zone_batch_malloc().
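
As a rough illustration of the "malloc a large buffer and divide it up 
yourself" option, here is a minimal bump allocator in C.  The Arena 
type and the arena_* functions are hypothetical names for this sketch, 
not an Apple API, and it assumes all the blocks can be discarded 
together when the batch of work is finished.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names (Arena, arena_*); a sketch, not a system API. */
typedef struct {
    unsigned char *base;   /* start of the one big malloc'd buffer */
    size_t capacity;       /* total bytes in the buffer */
    size_t used;           /* bytes handed out so far */
} Arena;

/* One trip to malloc up front. Returns 1 on success, 0 on failure. */
static int arena_init(Arena *a, size_t capacity) {
    a->base = malloc(capacity);
    a->capacity = (a->base != NULL) ? capacity : 0;
    a->used = 0;
    return a->base != NULL;
}

/* Carve `size` bytes off the front of the unused region.
   Sizes are rounded up to the platform's maximum fundamental alignment,
   so every returned block is as well aligned as malloc's own buffer.
   Returns NULL when the buffer is exhausted. */
static void *arena_alloc(Arena *a, size_t size) {
    const size_t align = _Alignof(max_align_t);   /* C11; a power of two */
    assert(a->used <= a->capacity);
    if (size > SIZE_MAX - (align - 1)) return NULL;   /* overflow guard */
    size_t rounded = (size + align - 1) & ~(align - 1);
    if (rounded > a->capacity - a->used) return NULL;
    void *p = a->base + a->used;
    a->used += rounded;
    return p;
}

/* Forget every block at once; the buffer is kept for the next batch. */
static void arena_reset(Arena *a) {
    a->used = 0;
}

/* Give the buffer back to malloc. */
static void arena_destroy(Arena *a) {
    free(a->base);
    a->base = NULL;
    a->capacity = 0;
    a->used = 0;
}
```

Inside a hot loop, each arena_alloc is a few arithmetic operations 
instead of a call into the allocator, and arena_reset stands in for 
what would otherwise be thousands of individual frees.  The trade-off 
is that blocks cannot be freed one at a time.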

- Ben