Locating managed objects within ObjectAlloc (was Re: Garbage collection, core data, and tight loops)
-
On Nov 2, 2007, at 3:01 PM, John R. Timmer wrote:
> Using garbage collection resulted in a significant memory gain, but
> nowhere near bad enough to crash the program. Oddly, the memory use
> did not subside after the loop had finished.
This sounds like caching behavior that I think I've observed. My app
creates millions of managed objects during an import process, and I've
seen large memory usage that doesn't go down when the objects are
dealloced. Looking at backtraces in ObjectAlloc, I see a lot of
mallocs in caching functions as various objects are dealloc'ed. If I
close the NSPersistentDocument, I get some memory reclaimed, but not
what I expect. But, if I repeatedly run the import (without quitting),
the peak memory doesn't go up, so the memory is apparently reused.
This leads me to a question about using ObjectAlloc (in Instruments on
Leopard) effectively, especially with respect to Core Data. When
looking at allocations, there are a large number of GeneralBlocks of
various sizes, but no NSManagedObjects. I have some custom
subclasses, too, and those aren't identified either in the Categories
column. I want to observe the usage patterns of the various entities
in my model, but I'm not sure how to do that. Is there an easy way to
filter managed objects?
Thanks,
----
Aaron Burghardt
<aburgh...> -
> This sounds like caching behavior that I think I've observed. My app
> creates millions of managed objects during an import process, and I've
> seen large memory usage that doesn't go down when the objects are
> dealloced.
You mean RSIZE as reported by 'top', not malloc's free heap space as
reported by 'heap'.
> Looking at backtraces in ObjectAlloc, I see a lot of
> mallocs in caching functions as various objects are dealloc'ed. If I
> close the NSPersistentDocument, I get some memory reclaimed, but not
> what I expect. But, if I repeatedly run the import (without quitting),
> the peak memory doesn't go up, so the memory is apparently reused.
Correct. The memory is freed (heap space), but not returned to the
kernel (VM mapping). This is the behavior of malloc on OSX. You will
see a high watermark effect.
> This leads me to a question about using ObjectAlloc (in Instruments on
> Leopard) effectively, especially with respect to Core Data. When
> looking at allocations, there are a large number of GeneralBlocks of
> various sizes, but no NSManagedObjects. I have some custom
> subclasses, too, and those aren't identified either in the Categories
> column. I want to observe the usage patterns of the various entities
> in my model, but I'm not sure how to do that. Is there an easy way to
> filter managed objects?
No, allocation events for managed objects are not currently tracked by
ObjectAlloc except as general blocks. The 'heap' and 'leaks' tools
both perceive managed objects. You'll probably find the 'heap' tool
useful for experimenting with these questions. Also, 'malloc_history'
is much improved on Leopard and often overlooked.
- Ben -
On Nov 3, 2007, at 7:08 PM, Ben Trumbull wrote:
> You mean RSIZE as reported by 'top', not malloc's free heap space as
> reported by 'heap'.
Actually, I was observing VSIZE (which I thought should track with
RSIZE as long as the system isn't swapping my app) and also the graph
in ObjectAlloc, which appeared to correlate with VSIZE. In other
words, even the graph of all allocations in ObjectAlloc shows only a
small reduction when the document is closed. 'heap' is the tool I
needed, though. Thanks!
>> the peak memory doesn't go up, so the memory is apparently reused.
>
> Correct. The memory is freed (heap space), but not returned to the
> kernel (VM mapping). This is the behavior of malloc on OSX. You
> will see a high watermark effect.
>
I've read that, but forgot to consider it in this case. However, I'm
confused by the test case below. When the memory is malloc'ed, VSIZE
goes up. When the memory is used, RSIZE goes up. When the memory is
freed, both VSIZE and RSIZE go back down. Does this contradict what
you are saying?
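#include <stdlib.h>
#include <unistd.h>   /* sleep() */

int main(int argc, char *argv[])
{
    char *ptr;
    int i;

    sleep(5);
    while (1) {
        /* One large allocation: VSIZE goes up. */
        ptr = malloc(10 * 1024 * 1024);
        if (ptr == NULL)
            return 1;
        sleep(10);
        /* Touch every byte: RSIZE goes up. */
        for (i = 0; i < (10 * 1024 * 1024); i++)
            *(ptr + i) = 'x';
        sleep(10);
        /* Free the block: both VSIZE and RSIZE go back down. */
        free(ptr);
        sleep(10);
    }
}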
> No, allocation events for managed objects are not currently tracked
> by ObjectAlloc except as general blocks. The 'heap' and 'leaks'
> tools both perceive managed objects. You'll probably find the
> 'heap' tool useful for experimenting with these questions. Also,
> 'malloc_history' is much improved on Leopard and often overlooked.
I was using 'leaks' and had a few leaks, but nothing that explained
the large amount of memory I was using, so I suspected I had a retain
cycle with some managed objects. Ultimately, I found I forgot to add
an autorelease pool to one of my loops.
You are right, 'heap' would have been very useful for this. I've used
malloc_history for an over-release/EXC_BAD_ACCESS bug, and it was great for
that.
Thanks, Ben.
Aaron -
On Nov 4, 2007, at 3:27 AM, <ajb.lists...> wrote:
>> Correct. The memory is freed (heap space), but not returned to the
>> kernel (VM mapping). This is the behavior of malloc on OSX. You
>> will see a high watermark effect.
>>
> I've read that, but forgot to consider it in this case. However,
> I'm confused by the test case below. When the memory is malloc'ed,
> VSIZE goes up. When the memory is used, RSIZE goes up. When the
> memory is freed, both VSIZE and RSIZE go back down. Does this
> contradict what you are saying?
No. The relationship is more complicated. For large blocks of
memory, malloc will return the allocation back to the kernel. For
small blocks, it will not, but instead coalesce the free space for
reuse later. The objects used in Core Data are almost exclusively
small. Amit Singh's book describes the malloc algorithms in
considerable detail.
<http://www.amazon.com/Mac-OS-Internals-Systems-Approach/dp/0321278542/ref=sr_1_1/103-1158819-7883057?ie=UTF8&s=books&qid=1194215432&sr=1-1>
> ptr = malloc(10 * 1024 * 1024);
Try again with something closer to the size of your managed objects.
Like 96-256 bytes.
Generally, I find RSIZE to reflect the high watermark of the process,
and use 'heap' to examine the currently free space within it. Your
mileage may vary. There's some pretty vociferous disagreement over
the "ideal" way to measure memory utilization (available vs. vm
deallocated vs. internal fragmentation, etc.).
- Ben -
On Nov 4, 2007, at 5:36 PM, Ben Trumbull wrote:
>
> On Nov 4, 2007, at 3:27 AM, <ajb.lists...> wrote:
>
> No. The relationship is more complicated. For large blocks of
> memory, malloc will return the allocation back to the kernel. For
> small blocks, it will not, but instead coalesce the free space for
> reuse later. The objects used in Core Data are almost exclusively
> small. Amit
>
>> ptr = malloc(10 * 1024 * 1024);
>
> Try again with something closer to the size of your managed
> objects. Like 96-256 bytes.
>
> Generally, I find RSIZE to reflect the high watermark of the
> process, and use 'heap' to examine the currently free space within
> it. Your mileage may vary. There's some pretty vociferous
> disagreement over the "ideal" way to measure memory utilization
> (available v.s. vm deallocated v.s. internal fragmentation, etc)
>
Yup, even 10 KB mallocs were small enough. For anyone that may be
following this thread, the following test looks different in Activity
Monitor (and presumably ObjectAlloc). Once the VSIZE and RSIZE go
up, they stay at that level until the test quits:
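#include <stdlib.h>
#include <unistd.h>   /* sleep() */

int main(int argc, char *argv[])
{
    char *ptr[1000];
    int i, j;

    sleep(5);
    while (1) {
        /* 1000 allocations of 10 KB each instead of one 10 MB block. */
        for (i = 0; i < 1000; i++) {
            ptr[i] = malloc(10 * 1024);
            if (ptr[i] == NULL)
                return 1;
        }
        sleep(10);
        /* Touch every byte of every block. */
        for (i = 0; i < 1000; i++) {
            for (j = 0; j < (10 * 1024); j++)
                *(ptr[i] + j) = 'x';
        }
        sleep(10);
        for (i = 0; i < 1000; i++)
            free(ptr[i]);
        sleep(10);
    }
}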
Thanks again,
Aaron -
On Nov 4, 2007, at 7:01 PM, Aaron Burghardt wrote:
> Yup, even 10 KB mallocs were small enough. For anyone that may be
> following this thread, the following test looks different in
> Activity Monitor (and presumably ObjectAlloc). Once the VSIZE
> and RSIZE go up, they stay at that level until the test quits:
And do not represent "the memory used by the test," but rather "the
address space used by the test." These are very different things.
Address space can be in use with no physical memory backing it, and as
Ben said, can represent a high-water mark.
In top(1) terms, the best measure of actual memory use for a process
is RPRVT, not RSIZE. As seen on the top(1) man page:
RPRVT - Resident private memory size.
RSIZE - Total resident memory size, including shared pages.
VSIZE - Total address space allocated, including shared pages.
"Shared" pages are memory pages that are mapped into multiple address
spaces, typically read-only. These include the system frameworks and
window back-buffers handed out by the window server.
If I look at, say, a fresh launch of TextEdit on my system right now,
it has the following:
  PID  COMMAND    RPRVT  RSIZE  VSIZE
 6596  TextEdit   1680K  6324K   358M
Thus even though it has 358M of its address space assigned, and an
RSIZE of 6324K, only 1680K of that is really "unique" to TextEdit.
And doing a "heap TextEdit" corroborates this; the number of bytes
allocated is very close to the RPRVT value. (I think the RPRVT value
is calculated in terms of allocated pages, rather than allocated
bytes, whereas I think heap looks at the malloc statistics which are
in terms of bytes.)
-- Chris


