CFDictionary callback on PPC vs Intel
-
Hello all,
The included semi-pseudo code works on my MacBook Pro (Core Duo,
32-bit) and doesn't work on my Power Mac G5.
MacBook Pro behavior: Gets or creates the mref and adds it to the args_set.
Power Mac G5 behavior: The CFNumberRef mref never changes after the
first run no matter what the CFStringRef smref value is. The strange
thing is it does make it inside the if statement from the
CFDictionaryGetValueIfPresent (according to debugging) but the mref
pointer never changes. Thus, the args_set only ever has one pointer
value in it no matter the data available.
The following is a stripped down version that probably doesn't compile
(missing file pointer and such) as I copied and pasted only the
relevant parts:
/* Begin code snippet */
CFMutableDictionaryRef args = CFDictionaryCreateMutable
    (NULL, 0, &kCFCopyStringDictionaryKeyCallBacks, NULL);
__strong CFMutableSetRef args_set = CFSetCreateMutable(NULL, 0, NULL);
while (fgets(line, LINE_MAX, fp) != NULL && count < 26000) {
    count = count + 1;   /* row counter */
    char *m = strtok(line, sep);
    int im = atoi(m);
    CFNumberRef mref;
    CFStringRef smref = CFStringCreateWithCString(NULL, m, kCFStringEncodingUTF8);
    CFStringRef srref = CFStringCreateWithCString(NULL, r, kCFStringEncodingUTF8);
    if (!CFDictionaryGetValueIfPresent(args, smref, (const void **)&mref)) {
        CFNumberRef nmref = CFMakeCollectable(CFNumberCreate(NULL,
            kCFNumberSInt16Type, &im));
        CFDictionaryAddValue(args, smref, nmref);
        CFDictionaryGetValueIfPresent(args, smref, (const void **)&mref);
    }
    CFRelease(smref);
    CFSetAddValue(args_set, mref);
}
/* End code snippet */
In case you are wondering why I want to do this: The file I'm reading
in can be many gigabytes in size. It is a csv representation of a
database in which each row uses a few different values over and over.
I basically need unique pointers to each value. In most cases the
value itself is not needed at all. Doing this allowed me to turn the
text of a value in a csv into an existing pointer to an in memory
object, and if it doesn't exist, create it.
Any ideas why this would work on intel and not on ppc?
Or, better yet, anyone have a better way to do this? I'm new at this
and would appreciate any suggestions code clarifications and memory
management help. I am using garbage collection.
Thank you!
--Derrek
http://www.allofzero.com/~dleute/
<dleute...> (preferred contact)
home: (802) 347-1573
cell: (516) 528-4619 -
On Feb 17, 2008 10:51 AM, Derrek Leute <dleute...> wrote:
> int im = atoi(m);
You declare im as an int.
> CFNumberRef nmref = CFMakeCollectable(CFNumberCreate(NULL,
> kCFNumberSInt16Type, &im));
But tell CFNumber it's a signed short. The only reason it ever worked
is by luck. Don't do that.
Since this is cocoa-dev, may I suggest using Cocoa instead of CF? As
far as I can see you aren't taking advantage of anything CF offers
above what Cocoa offers, and Cocoa is nicer-looking and will
incidentally avoid this error since NSNumber takes parameters by value
instead of by reference.
Mike -
> You declare im as an int.
>
>> CFNumberRef nmref = CFMakeCollectable(CFNumberCreate(NULL,
>> kCFNumberSInt16Type, &im));
>
> But tell CFNumber it's a signed short. The only reason it ever worked
> is by luck. Don't do that.
Wow. It had to be that simple. :) So, to sum up, it wasn't setting the
value (the key was fine) because my types didn't match up. I get it.
:)
The project actually started in Cocoa but I moved to CF because
NSDictionary forces a string copy for keys. This made memory needs go
through the roof (unless I'm misusing or misunderstanding how to use
it). CF lets me use anything as a key. In some sense I would love to
be in Cocoa as I had to reimplement NSSet intersection and some other
little basic things to do this. But it's fast, and it seems to be
working quite well now. Not to mention that pointer comparison also
seemed ridiculously faster on this amount of data.
Thanks so much!
> -
On Feb 17, 2008, at 10:43 AM, Derrek Leute wrote:
> The project actually started in cocoa but I moved to CF because
> NSDictionary forces a string copy for keys. This made memory needs go
> through the roof (unless I'm misusing or misunderstanding how to use
> it). CF lets me use anything as a key.
Close but not quite exactly correct.
NSDictionary requires that keys conform to the NSCopying protocol
because it makes a copy of the key (regardless of whether it is an
NSString or anything else). The important part is that for an immutable
NSString, the copy simply increases the reference count - no increase in
the memory usage at all. If you pass an NSMutableString as a key, it
will make an (immutable) copy that does, in fact, increase memory usage.
You may wonder why it does this - consider this:
NSMutableString *s1 = [NSMutableString stringWithString: @"abc"];
NSMutableString *s2 = [NSMutableString stringWithString: @"def"];
NSMutableDictionary *d = [NSMutableDictionary dictionary];
[d setObject: [NSNumber numberWithInt: 1] forKey: s1];
[d setObject: [NSNumber numberWithInt: 2] forKey: s2];
[s1 setString: @"def"];
NSLog(@"%@", [d objectForKey: @"def"]);
If it didn't make an immutable copy, you would end up with two values
that both have the key "def".
So, if you use immutable objects as keys, your memory usage won't go
up. If you use mutable objects as keys, it will make a copy because
if it didn't, bad things would happen.
Glenn Andreas <gandreas...>
<http://www.gandreas.com/> wicked fun!
quadrium | prime : build, mutate, evolve, animate : the next
generation of fractal art -
On Feb 17, 2008 11:43 AM, Derrek Leute <dleute...> wrote:
> The project actually started in cocoa but I moved to CF because
> NSDictionary forces a string copy for keys. This made memory needs go
> through the roof (unless I'm misusing or misunderstanding how to use
> it). CF lets me use anything as a key. In some sense I would love to
> be in Cocoa as I had to reimplement NSSet intersection and some other
> little basic things to do this. But it's fast, and it seems to be
> working quite well now. Not to mention that pointer comparison also
> seemed ridiculously faster on this amount of data.
Check out toll-free bridging:
http://www.cocoadev.com/index.pl?TollFreeBridging
In short, a lot of CF types are equivalent to their NS types and can
be "converted" simply by casting. I put "converted" in quotes because
no conversion takes place; the same object is simultaneously an NS
object and a CF object.
Note that a CFDictionary created with custom callbacks will still copy
its keys if you use -setObject:forKey:, so don't do that. But
otherwise you can take a CFSet and use NSSet methods, you can put
NSNumbers in your CFDictionaries, and so forth. As far as I know, all
custom callbacks will be respected aside from NSDictionary insisting
on creating copies of the keys.
Mike -
On Feb 17, 2008, at 9:51 AM, Derrek Leute wrote:
> CFMutableDictionaryRef args = CFDictionaryCreateMutable
> (NULL,0,&kCFCopyStringDictionaryKeyCallBacks,NULL);
Huh? How can you pass a reference to a CONST
(kCFCopyStringDictionaryKeyCallBacks)? I thought a CONST was just a
compiler replacement at compile-time, and doesn't actually have any
storage. Or is this not a CONST?
-
>> CFMutableDictionaryRef args = CFDictionaryCreateMutable
>> (NULL,0,&kCFCopyStringDictionaryKeyCallBacks,NULL);
> Huh? How can you pass a reference to a CONST
> (kCFCopyStringDictionaryKeyCallBacks)? I thought a CONST was just a
> compiler replacement at compile-time, and doesn't actually have any
> storage. Or is this not a CONST?
I'm just doing as the Apple documentation and examples do. They say
to pass it by reference, so I do. That's all I know. :)
--Derrek -
This is excellent to know. What about performance? Performance is
*very* important in this case. A 1% increase can knock hours or days
off of processing time on a data set this size. Any performance metrics
available?
--Derrek
On Feb 17, 2008 11:54 AM, glenn andreas <gandreas...> wrote:
> So, if you use immutable objects as keys, your memory usage won't go
> up. If you use mutable objects as keys, it will make a copy because
> if it didn't, bad things would happen.
> -
On Feb 17, 2008, at 08:43, Derrek Leute wrote:
> The project actually started in cocoa but I moved to CF because
> NSDictionary forces a string copy for keys. This made memory needs go
> through the roof (unless I'm misusing or misunderstanding how to use
> it). CF lets me use anything as a key. In some sense I would love to
> be in Cocoa as I had to reimplement NSSet intersection and some other
> little basic things to do this. But it's fast, and it seems to be
> working quite well now. Not to mention that pointer comparison also
> seemed ridiculously faster on this amount of data.
Sorry if I'm being dense here, but I don't see how it's NSDictionary's
fault. NSDictionary is only going to copy the key when you insert
something, and (as somebody already pointed out) if the key string is
immutable there may be no actual copying.
It looks like the real problem is that you're creating a large number
of temporary objects (the strings you use to look up the dictionary)
with a lifetime of 1 loop iteration, but doing nothing to reclaim the
unused memory (in the code snippet you showed us, at least). Judicious
use of GC's collectIfNeeded might take care of that.
The reason CFDictionary seems to work better is nothing to do with
string copying per se. It's actually that innocent-looking CFRelease,
which reclaims the memory you used for the temporary string at each
iteration of the loop. In effect, you've switched from GC to pseudo-
non-GC mode for the duration of the loop. :) That's why memory usage
stays moderate. -
It was mostly the immutable issue. I was using mutable strings when I
used NSDictionary, as I was unaware that it would simply retain
immutable strings instead of copying them (is this documented
somewhere that I missed?).
When I was using NSDictionary I believe I was releasing the strings in
a balanced fashion. I do realize I'm mixing GC with non-GC which
probably isn't very comprehensible. But there are no apparent leaks
and it is getting the expected output.
Right now it's running very well. For the next version I'll try Cocoa and
see how I do. :) As this was just a command-line utility, I didn't see
an overwhelming need to go beyond Core Foundation.
Thanks for everyone's help!
On Feb 17, 2008 3:05 PM, Quincey Morris <quinceymorris...> wrote:
> Sorry if I'm being dense here, but I don't see how it's NSDictionary's
> fault. NSDictionary is only going to copy the key when you insert
> something, and (as somebody already pointed out) if the key string is
> immutable there may be no actual copying.
>
> It looks like the real problem is that you're creating a large number
> of temporary objects (the strings you use to look up the dictionary)
> with a lifetime of 1 loop iteration, but doing nothing to reclaim the
> unused memory (in the code snippet you showed us, at least). Judicious
> use of GC's collectIfNeeded might take care of that.
>
> The reason CFDictionary seems to work better is nothing to do with
> string copying per se. It's actually that innocent-looking CFRelease,
> which reclaims the memory you used for the temporary string at each
> iteration of the loop. In effect, you've switched from GC to pseudo-
> non-GC mode for the duration of the loop. :) That's why memory usage
> stays moderate.
>


