crashes loading saved file

  • I've been having problems with my app crashing with an EXC_BAD_ACCESS while unarchiving a saved data file. The file is a graph representation of musical structure, created by a machine learning algorithm. When the file/graph is small there are no problems, but as I add more training material, and the file increases in size, at a certain point it starts crashing during unarchiving. Strangely, it has no problems saving the file, only unarchiving. The file is saved using:

    [NSKeyedArchiver archiveRootObject:model toFile:filePath];

    Pretty straightforward. NSZombieEnabled gives no info, and code analysis reveals no memory warnings. I've been over the code many, many times, and haven't been able to track down a reasonable cause. The graph does have circular references/loops in some cases (i.e., node B points to a "parent" node A, which holds a reference to node B as a "child"), but I doubt that's the problem, since smaller files would have the same basic structure -- I used archiveRootObject:toFile:, which is supposed to handle this situation (as I understand it).
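
    (For reference, the matching load call is the unarchiver's convenience method; a minimal sketch -- note that under retain/release the returned root is autoreleased, so it has to be retained if it's kept around:)

        model = [[NSKeyedUnarchiver unarchiveObjectWithFile:filePath] retain];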

    The last 30 frames of the backtrace:

    * thread #1: tid = 0x2603, 0x959f115c CoreFoundation`__CFStringEncodeByteStream + 12, stop reason = EXC_BAD_ACCESS (code=2, address=0xbf81acfc)
        frame #0: 0x959f115c CoreFoundation`__CFStringEncodeByteStream + 12
        frame #1: 0x95a27a0a CoreFoundation`CFStringGetCString + 922
        frame #2: 0x95a683d7 CoreFoundation`-[__NSCFString getCString:maxLength:encoding:] + 119
        frame #3: 0x9b693190 Foundation`NSClassFromString + 82
        frame #4: 0x9b6cb4bf Foundation`_decodeObjectBinary + 2191
        frame #5: 0x9b6ccec9 Foundation`-[NSKeyedUnarchiver _decodeArrayOfObjectsForKey:] + 1533
        frame #6: 0x9b6a06e7 Foundation`-[NSArray(NSArray) initWithCoder:] + 693
        frame #7: 0x9b6cb9c0 Foundation`_decodeObjectBinary + 3472
        frame #8: 0x9b6caa66 Foundation`_decodeObject + 197
        frame #9: 0x0014c017 ManuScore`-[CbCMNode initWithCoder:] + 663 at CbCMNode.m:1176
        frame #10: 0x9b6cb9c0 Foundation`_decodeObjectBinary + 3472
        frame #11: 0x9b6ccec9 Foundation`-[NSKeyedUnarchiver _decodeArrayOfObjectsForKey:] + 1533
        frame #12: 0x9b6a06e7 Foundation`-[NSArray(NSArray) initWithCoder:] + 693
        frame #13: 0x9b6cb9c0 Foundation`_decodeObjectBinary + 3472
        frame #14: 0x9b6caa66 Foundation`_decodeObject + 197
        frame #15: 0x0014c017 ManuScore`-[CbCMNode initWithCoder:] + 663 at CbCMNode.m:1176
        frame #16: 0x9b6cb9c0 Foundation`_decodeObjectBinary + 3472
        frame #17: 0x9b6ccec9 Foundation`-[NSKeyedUnarchiver _decodeArrayOfObjectsForKey:] + 1533
        frame #18: 0x9b6a06e7 Foundation`-[NSArray(NSArray) initWithCoder:] + 693
        frame #19: 0x9b6cb9c0 Foundation`_decodeObjectBinary + 3472
        frame #20: 0x9b6caa66 Foundation`_decodeObject + 197
        frame #21: 0x0014c017 ManuScore`-[CbCMNode initWithCoder:] + 663 at CbCMNode.m:1176
        frame #22: 0x9b6cb9c0 Foundation`_decodeObjectBinary + 3472
        frame #23: 0x9b6ccec9 Foundation`-[NSKeyedUnarchiver _decodeArrayOfObjectsForKey:] + 1533
        frame #24: 0x9b6a06e7 Foundation`-[NSArray(NSArray) initWithCoder:] + 693
        frame #25: 0x9b6cb9c0 Foundation`_decodeObjectBinary + 3472
        frame #26: 0x9b6caa66 Foundation`_decodeObject + 197
        frame #27: 0x0014c017 ManuScore`-[CbCMNode initWithCoder:] + 663 at CbCMNode.m:1176
        frame #28: 0x9b6cb9c0 Foundation`_decodeObjectBinary + 3472
        frame #29: 0x9b6ccec9 Foundation`-[NSKeyedUnarchiver _decodeArrayOfObjectsForKey:] + 1533
        frame #30: 0x9b6a06e7 Foundation`-[NSArray(NSArray) initWithCoder:] + 693

    This morning, I tried enabling Guard Malloc (on its own, without zombies), and was surprised to see the app crash during training, with the following error:

    GuardMalloc[ManuScore-2438]: Failed to VM allocate 1864016 bytes
    GuardMalloc[ManuScore-2438]: Explicitly trapping into debugger!!!

    Is it simply running out of VM while trying to build the graph? If so, why doesn't this happen with Guard Malloc off? Also, with zombies and guard malloc off, why is it only when reading the file that the app crashes, not during training (i.e., while the graph is being built)?

    One thing I have noticed, that seems pretty weird, is that the complete backtrace when it crashes during unarchiving is 25962 frames long! Could it simply be that it's running out of memory while trying to unarchive (i.e., on the stack)? If so, how can I get around that? Some sort of caching, perhaps?
    The file is only 9.6 MB, so it's not a massive file...

    Any thoughts appreciated.

    J.

    ------------------------------------------------------
    James B. Maxwell
    Composer/Researcher/PhD Candidate
  • On 9 May 2012, at 1:58 PM, James Maxwell wrote:

    > This morning, I tried enabling Guard Malloc (on its own, without zombies), and was surprised to see the app crash during training, with the following error:
    >
    > GuardMalloc[ManuScore-2438]: Failed to VM allocate 1864016 bytes
    > GuardMalloc[ManuScore-2438]: Explicitly trapping into debugger!!!
    >
    >
    > Is it simply running out of VM while trying to build the graph? If so, why doesn't this happen with Guard Malloc off? Also, with zombies and guard malloc off, why is it only when reading the file that the app crashes, not during training (i.e., while the graph is being built)?
    >
    > One thing I have noticed, that seems pretty weird, is that the complete backtrace when it crashes during unarchiving is 25962 frames long! Could it simply be that it's running out of memory while trying to unarchive (i.e., on the stack)? If so, how can I get around that? Some sort of caching, perhaps?
    > The file is only 9.6 MB, so it's not a massive file...

    1. What kind of memory-management scheme are you using? Garbage collection, ARC, or retain-release?

    2. Assuming (praying) that you're on a version-control system, consider creating a branch and converting the project to ARC. It will make better memory-management choices than you will. See if that helps.

    3. Are you accumulating lots of temporary (autoreleased) objects? Can you investigate embedding code inside loops in @autoreleasepool{...} blocks? @autoreleasepool is not an ARC feature. It's usable in any OS target, and ARC / ARC conversion don't create local autorelease pools for you.

    4. What does the Allocations template in Instruments tell you?

    5. Failing all of that, would you be comfortable posting your -initWithCoder: methods?

    — F

    --
    Fritz Anderson -- Xcode 4 Unleashed: Due 21 May 2012 -- <http://x4u.manoverboard.org/>
  • On May 9, 2012, at 12:58 PM, James Maxwell wrote:

    > I've been having problems with my app crashing with an EXC_BAD_ACCESS while unarchiving a saved data file. The file is a graph representation of musical structure, created by a machine learning algorithm. When the file/graph is small there are no problems, but as I add more training material, and the file increases in size, at a certain point it starts crashing during unarchiving.

    Are you building for OS X or iOS? If the former, then are you building for X86 or X86-64? If the former, then have you considered transitioning? :)

    > This morning, I tried enabling Guard Malloc (on its own, without zombies), and was surprised to see the app crash during training, with the following error:
    >
    > GuardMalloc[ManuScore-2438]: Failed to VM allocate 1864016 bytes
    > GuardMalloc[ManuScore-2438]: Explicitly trapping into debugger!!!
    >
    >
    > Is it simply running out of VM while trying to build the graph?

    Yes.

    > If so, why doesn't this happen with Guard Malloc off?

    Turning Guard Malloc on greatly increases the overall memory usage of the process.

    > Also, with zombies and guard malloc off, why is it only when reading the file that the app crashes, not during training (i.e., while the graph is being built)?

    You can see for yourself by running your app in Instruments using the object allocations instrument. This is just a guess, but what might be happening is the reading process is directly or indirectly generating many temporary objects that are being pushed into the autorelease/collection pool, which is bulging after a while. If you're using RR or ARC memory management, then you can fix this by grouping code that generates temporary objects into @autoreleasepool blocks. If you're using GC memory management, then you might need to manually run the collector more often.
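
    (A minimal sketch of the @autoreleasepool pattern under retain/release or ARC; the loop and names here -- events, trainWithEvent: -- are illustrative, not the app's actual API:)

        for (NSUInteger i = 0; i < [events count]; i++)
        {
            @autoreleasepool
            {
                // Temporaries created in here (autoreleased NSNumbers, strings,
                // and so on) are released at the end of each iteration instead
                // of accumulating in the thread's outer pool for the whole run.
                [model trainWithEvent:[events objectAtIndex:i]];
            }
        }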

    Nick Zitzmann
    <http://www.chronosnet.com/>
  • On May 9, 2012, at 11:58 AM, James Maxwell wrote:

    > Pretty straightforward. NSZombieEnabled gives no info, and code analysis reveals no memory warnings. I've been over the code many, many times, and haven't been able to track down a reasonable cause.

    Consider using valgrind. It’s pretty easy to use and can find all sorts of edge cases in code, since it’s basically running it in a simulator and watching every memory access like a hawk.

    You can install it using your favorite package manager (brew, port, etc.)

    —Jens
  • Thanks All,

    I'm just using retain-release, so I'll look into converting a version to ARC. Unfortunately, I can't go 64-bit as I'm using DrawKit heavily, which doesn't seem to support 64-bit builds. I'll also look into @autoreleasepool in certain places. I do have a tendency to use the factory methods from NSNumber quite often, so there may be cases where I'm creating lots of autoreleased NSNumber objects.
    However, I have to admit that I'm not sure how to fix the unarchiving process, since the code runs fine while all the learning is happening and the graph is being built. It seems to be the unarchiving process in particular that's a problem. Is there some other way of periodically draining the pool during the unarchiving process?

    J.

    ------------------------------------------------------
    James B. Maxwell
    Composer/Researcher/PhD Candidate
  • On 10/05/2012, at 10:40 AM, James Maxwell wrote:

    > Thanks All,
    >
    > I'm just using retain-release, so I'll look into converting a version to ARC. Unfortunately, I can't go 64-bit as I'm using DrawKit heavily, which doesn't seem to support 64-bit builds.

    While I haven't released DK 64-bit publicly, there's nothing in it that proved to be tricky with 64-bit. If you run the 64-bit code conversion tool over it, it should pretty much work without a hitch. If you do run into problems, let me know privately and I'll help.

    --Graham
  • 25,962 frames on the stack seems to me rather a lot. Yes, unarchiving is a very recursive process, but that seems pretty deep.

    I missed the start of this thread, so I don't know what mechanism you are using to archive and unarchive things, but is it possible you've tripped it up and it's going around a loop in the object graph, creating the same things over and over again? (I know the standard archiver/unarchiver classes are coded to avoid that, but if you have some code of your own, or have found a bug in them…) I think the best advice so far was to look at the Allocations instrument in Instruments, sorted by number of objects: it should be very clear very quickly if you expected to have made a few hundred of something whilst unarchiving and instead have 20,000 of them.

    You could take a look at the stack trace as well, see if it looks like it's repeating itself, if it does and goes through any of your code then that gives you a place to look.

  • Yes, I agree about 25k frames being pretty big. It's possible, though, because the structure being unarchived is pretty complex... I decided to put an NSLog(@"Unarchiving %p", self) at the line where the crash finally occurs in initWithCoder of my CbCMNode class. Pasting the output into TextWrangler and searching for any given object returns only a single match, so I'm guessing there's no actual looping going on (but maybe there's a better way to check?). Interestingly, I just ran the Allocations instrument and it opened the file just fine. Poking around a little more, I noticed that profiling was running the release build, which is on -O2 optimization and seems to be able to load... Of course, this will probably break as well, if I do more training.
    Unfortunately, ARC is going to be awkward for this project, as I'm getting loads of complaints from the ARC migration tool (not so many in my own code, but a few from DrawKit, which is used heavily in my app). But, either way, I think I have to figure out how to get the memory down. What seems strange to me is that, once the app is open, it's only taking 50 MB of real memory -- less than Mail, the Finder, Safari... So it's not a huge amount of memory. I think it's just the fact that the stack gets filled **during** the unarchiving process. But surely there must be a way around this. No?

    J.

    ------------------------------------------------------
    James B. Maxwell
    Composer/Researcher/PhD Candidate
  • Okay, so I'm back to trying to tackle this annoying unarchiving crash…

    Just to recap the problem: I get an EXC_BAD_ACCESS crash when unarchiving certain files from disk. The file is a keyed archive, which contains a fairly complex custom object graph, with plenty of circular references (i.e., parentNode <---> childNode stuff). When this graph is relatively small I have no problems. But when it gets larger, it crashes. As mentioned previously, one seemingly significant thing about the crash is that the backtrace is >25,000 frames long. I've taken this to suggest that perhaps: A) some circular reference is getting stuck in a loop, or B) the graph is large enough that, while the unarchiver is trying to keep track of circular references, the stack overflows. I don't know if either of these possibilities makes sense, so I'm wondering how I might test for each?

    Partly because the massive backtrace isn't just a list of identical calls, and partly because the unarchiver is supposed to handle circular references, I kind of suspect B. But, if this is the case, how can I get around it? I already use archiveRootObject:toFile: for archiving, so I would think I should be exploiting the built-in recursion checking stuff… Accordingly, I use unarchiveObjectWithFile: to unarchive the graph. Everything I've done is pretty basic stuff, so perhaps my structure calls for a more advanced approach(??) I did include @autoreleasepool blocks in a couple of places where temporary objects could be created during initialization. But that didn't help…

    So, I guess I'm wondering whether anyone else has had similar problems, and how you went about solving them. I should also mention that the file itself isn't very big -- even a 13 MB file will cause the crash.

    By the way, it's not the super common failure to retain the unarchived object… Also, NSZombies and Guard Malloc don't show any obvious problems, and the static analyzer shows no memory errors.

    Any further thoughts greatly appreciated.

    J.

    James B Maxwell
    Composer/Doctoral Candidate
    School for the Contemporary Arts (SCA)
    School for Interactive Arts + Technology (SIAT)
    Simon Fraser University
  • On May 28, 2012, at 15:14, James Maxwell wrote:

    > Just to recap the problem: I get an EXC_BAD_ACCESS crash when unarchiving certain files from disk. The file is a keyed archive, which contains a fairly complex custom object graph, with plenty of circular references (i.e., parentNode <---> childNode stuff). When this graph is relatively small I have no problems. But when it gets larger, it crashes. As mentioned previously, one seemingly significant thing about the crash is that the backtrace is >25,000 frames long. I've taken this to suggest that perhaps: A) some circular reference is getting stuck in a loop, or B) the graph is large enough that, while the unarchiver is trying to keep track of circular references, the stack overflows. I don't know if either of these possibilities makes sense, so I'm wondering how I might test for each?
    >
    > Partly because the massive backtrace isn't just a list of identical calls, and partly because the unarchiver is supposed to handle circular references, I kind of suspect B. But, if this is the case, how can I get around it? I already use archiveRootObject:toFile: for archiving, so I would think I should be exploiting the built-in recursion checking stuff… Accordingly, I use unarchiveObjectWithFile: to unarchive the graph. Everything I've done is pretty basic stuff, so perhaps my structure calls for a more advanced approach(??) I did include @autoreleasepool blocks in a couple of places where temporary objects could be created during initialization. But that didn't help…

    I think you're approaching this incorrectly.

    At best, you're stating the problem neutrally ("is getting stuck in a loop") as if uncertain whether to blame the frameworks (NSKeyedUnarchiver specifically) or your own code. This is a mistake. You *must* assume that you're doing something wrong, *until and unless* you find specific evidence that the frameworks are at fault.

    It's not that there can't be bugs in Cocoa frameworks. There are plenty. Rather, the rationale is that classes like NSKeyedUnarchiver have been used innumerable times, while your code has been used successfully -- how many times?

    Next, it's not remotely credible that NSKeyedUnarchiver is designed to recurse in a way that could put 25,000 frames on the stack. This would mean, for example, that NSKeyedUnarchiver wasn't usable in background threads, which by default have a *very* small stack.

    Occam's Razor says that you have a bug in your code.

    My guess is, given the backtrace you originally posted, that you've somehow added an object to one of its own array instance variables. That could well put NSKeyedUnarchiver into an infinite tailspin. (This would not be "parentNode <---> childNode stuff". It would be claiming that an object is its own child, which is a relationship you can't expect unarchiving to deal with, if you think about the order in which things happen.) The backtrace even appears to tell you the class of the object that's doing this: CbCMNode. According to the backtrace, this object is unarchiving an array object key, and that unarchives an instance of CbCMNode, which unarchives … .

    You might have an easier time of detecting this during archiving, rather than unarchiving. How about you add something like this to [CbCMNode encodeWithCoder:]:

    NSAssert(![self->whateverArray containsObjectIdenticalTo:self], @"Er, this shouldn't happen");

    Depending on the bug in your code, it might be a bit harder to find out where it's going wrong. However, as long as there's a small voice in your mind trying to throw the blame on the frameworks, you won't get anywhere.
  • Thanks, Quincey.

    Well, I've revisited this problem many times over the past year or so (obviously not on a daily, or even weekly, basis, but the problem has been lurking for a long time, unresolved). I've gone over the code in detail literally hundreds of times looking for the kind of problem you describe. In the parent/child relationships, the CbCMNode class has methods for "addChild" and "addParent" which specifically test for identity between the node being added and self, so that a node's "childNodes" array, for example, can never contain the node itself. Nevertheless, for the sake of thoroughness, I did just place the NSAsserts you recommended in encodeWithCoder, but to no avail.

    The only reason I've begun to even vaguely question the framework -- honestly, for the first time today -- is because I've read a number of threads today that talked about potential problems in NSKeyedUnarchiver when dealing with large, circular graphs -- one of which was actually posted by a developer from Apple (can't recall who now). Also, Mike Ash's discussions on the subject didn't help my confidence particularly...

    Further, the fact that the size of the archive is a reproducible factor in the crashes seems odd. The nature of the graphs is such that size shouldn't influence the topology in any significant way (at least not considering the size and complexity of some of the graphs that do load without problems). There may be more nodes in the larger graphs, but their connectivity is fundamentally the same as with small graphs. So any obvious problem, like adding "self" to a "childNodes" array, would almost certainly show up in a smaller graph as well. Having said that, it is a complex structure, so I'll go over it again…

    But in response to your general suggestion that I'm somehow immediately jumping to blaming the frameworks as a first course of action, that honestly couldn't be further from the truth.

    Again, if anybody has had a similar experience, or has any further thoughts, any help would be appreciated.

    J.

    James B Maxwell
    Composer/Doctoral Candidate
    School for the Contemporary Arts (SCA)
    School for Interactive Arts + Technology (SIAT)
    Simon Fraser University
  • On May 28, 2012, at 20:48, James Maxwell wrote:

    > The only reason I've begun to even vaguely question the framework -- honestly, for the first time today -- is because I've read a number of threads today that talked about potential problems in NSKeyedUnarchiver when dealing with large, circular graphs -- one of which was actually posted by a developer from Apple (can't recall who now). Also, Mike Ash's discussions on the subject didn't help my confidence particularly…

    > But in response to your general suggestion that I'm somehow immediately jumping to blaming the frameworks as a first course of action, that honestly couldn't be further from the truth.

    I think I was referring to that inner voice that *tempts* you to blame, not any actual blaming. :)

    We're probably at the point where you need to start showing code, at least in a reduced version, for encodeWithCoder and initWithCoder.

    Incidentally, assuming each CbCMNode instance has a child array and a parent pointer, you aren't trying to archive both, are you? I wouldn't expect that to work.
  • >
    > I think I was referring to that inner voice that *tempts* you to blame, not any actual blaming. :)

    Sure, understood.
    >
    > We're probably at the point where you need to start showing code, at least in a reduced version, for encodeWithCoder and initWithCoder.
    >

    - (void)            encodeWithCoder:(NSCoder *)aCoder
    {
        [aCoder encodeObject:eventKey forKey:@"eventKey"];
        [aCoder encodeObject:value forKey:@"value"];
        [aCoder encodeInt:type forKey:@"type"];
        [aCoder encodeInt:state forKey:@"state"];
        [aCoder encodeFloat:closure forKey:@"closure"];
        [aCoder encodeInt:depth forKey:@"depth"];
        [aCoder encodeInt:count forKey:@"count"];
        [aCoder encodeFloat:probability forKey:@"probability"];
        [aCoder encodeObject:schemaNode forKey:@"schemaNode"];
        [aCoder encodeObject:parentNodes forKey:@"parentNodes"];
        [aCoder encodeObject:childNodes forKey:@"childNodes"];
        [aCoder encodeObject:targetNodes forKey:@"targetNodes"];
        [aCoder encodeObject:targetEdgeCounts forKey:@"targetEdgeCounts"];
        [aCoder encodeObject:sourceNodes forKey:@"sourceNodes"];
        [aCoder encodeObject:superState forKey:@"superState"];
        [aCoder encodeObject:subStates forKey:@"subStates"];
    }

    - (id)              initWithCoder:(NSCoder *)aDecoder
    {
        if((self = [super init]))
        {
            eventKey = [[aDecoder decodeObjectForKey:@"eventKey"] retain];
            value = [[aDecoder decodeObjectForKey:@"value"] retain];
            type = [aDecoder decodeIntForKey:@"type"];
            state = [aDecoder decodeIntForKey:@"state"];
            closure = [aDecoder decodeFloatForKey:@"closure"];
            depth = [aDecoder decodeIntForKey:@"depth"];
            count = [aDecoder decodeIntForKey:@"count"];
            probability = [aDecoder decodeFloatForKey:@"probability"];
            schemaNode = [[aDecoder decodeObjectForKey:@"schemaNode"] retain];
            parentNodes = [[aDecoder decodeObjectForKey:@"parentNodes"] retain];
            childNodes = [[aDecoder decodeObjectForKey:@"childNodes"] retain];
            targetNodes = [[aDecoder decodeObjectForKey:@"targetNodes"] retain];
            targetEdgeCounts = [[aDecoder decodeObjectForKey:@"targetEdgeCounts"] retain];
            sourceNodes = [[aDecoder decodeObjectForKey:@"sourceNodes"] retain];
            superState = [[aDecoder decodeObjectForKey:@"superState"] retain];
            subStates = [[aDecoder decodeObjectForKey:@"subStates"] retain];
        }
        return self;
    }

    CbCMNode is an NSObject subclass, btw.

    > Incidentally, assuming each CbCMNode instance has a child array and a parent pointer, you aren't trying to to archive both, are you? I wouldn't expect that to work.

    Well, yes, that's what happens. In fact, it's much hairier than that! There's actually an array of parentNodes, not just one. It's a complex graph, as I mentioned, not a straightforward tree (which would already contain mutual parent/child references, it seems to me). In my structure, there are about three dimensions of this kind of referencing, allowing me to move around the graph in useful ways… It's encoding hierarchical musical structure, for a music learning/generation algorithm, so there's a lot of complexity involved. As I say, it loads fine with smaller files, and testing the structure after loading indicates that it has been reconstructed properly. This is why I suspect there's a problem with trying to keep track of all these relationships during unarchiving, which NSKeyedUnarchiver does on the stack, afaik. There's an interesting discussion on this here:

    http://forum.soft32.com/mac/Archiving-large-graphs-part-II-ftopict44343.html


    It's a very old thread, from 2004, but it does seem to be addressing similar problems. And I think these cases are rare enough that Apple wouldn't necessarily put a bunch of time into improving them… besides, Michael Ash's comments seem to suggest that they really couldn't improve matters anyway, within the overall design of the NSCoder model. I just wondered if, perhaps, there was a way to get the whole thing to happen on the heap, rather than the stack, so that it could complete successfully. It would be slow, but I could deal with that for the time being (i.e., until I get my doctorate finished, after which time I could rip it apart and try a different approach!).
    I did try, btw, using encodeConditionalObject for parentNodes, sourceNodes, and superState, all of which are CbCMNodes. But the structure was no longer intact after trying this (parent connections gone), so I think I misunderstood how conditional objects are supposed to work...

    J.

    James B Maxwell
    Composer/Doctoral Candidate
    School for the Contemporary Arts (SCA)
    School for Interactive Arts + Technology (SIAT)
    Simon Fraser University
  • On May 28, 2012, at 10:08 PM, James Maxwell wrote:

    > Well, yes, that's what happens. In fact, it's much hairier than that! There's actually an array of parentNodes, not just one. It's a complex graph, as I mentioned, not a straightforward tree (which would already contain mutual parent/child references, it seems to me). In my structure, there are about three dimensions of this kind of referencing, allowing me to move around the graph in useful ways…

    I suspect this isn't a bug in Foundation, so much as a scaling limit that you've exceeded with the complexity of the object graphs you write out. You've inadvertently found a way to get the graph traversal done by the unarchiver to recurse to arbitrarily deep levels, by generating graphs that have arbitrarily long non-cyclic paths through them.

    I suggest writing out simpler object graphs.

    One place you can start is by _not_ archiving the back-pointers. You can reconstruct those in the -initWithCoder method. For example, after a parent node unarchives its array of children, it can call each child to tell it, in effect, "who's your daddy?" so the child can initialize its parent pointer.

    In general, look at your graph and figure out the minimum number of object relations you need to archive to reconstruct its structure. Then archive only those, and recreate the rest at load time.

    (Another possibility is to work around this by creating a pthread with a really huge stack size limit, and doing the unarchiving on that thread. But I bet it'll be slow. Your users would probably rather you simplified the object graph and made opening documents faster.)
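
    (A rough sketch of that workaround under retain/release; the 64 MB figure and the names here are illustrative only:)

        #include <pthread.h>

        static void *UnarchiveOnBigStack(void *path)
        {
            // A secondary thread needs its own autorelease pool under RR.
            NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
            id root = [[NSKeyedUnarchiver unarchiveObjectWithFile:(NSString *)path] retain];
            [pool drain];
            return root;
        }

        // ... at the call site:
        pthread_attr_t attr;
        pthread_attr_init(&attr);
        pthread_attr_setstacksize(&attr, 64 * 1024 * 1024);   // vs. ~512 KB default
        pthread_t thread;
        pthread_create(&thread, &attr, UnarchiveOnBigStack, (void *)filePath);
        void *root = NULL;
        pthread_join(thread, &root);    // root is the retained unarchived graph
        pthread_attr_destroy(&attr);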

    —Jens
  • On May 28, 2012, at 10:51 PM, I wrote:

    > In general, look at your graph and figure out the minimum number of object relations you need to archive to reconstruct its structure. Then archive only those, and recreate the rest at load time.

    I just had another thought. Are you using linked lists? I suspect those are rather bad for the unarchiver, since it's likely to end up recursing all the way down the list, resulting in O(n) stack depth. That is, during -initWithCoder: for an item in the list, it'll be asked to unarchive the "next" property, which ends up calling -initWithCoder: for the next item in the list, and so on until it hits the end and can finally unwind the stack.

    If so, it would be a lot more efficient for the archiver if you stored the list as an NSArray, since that allows it to instantiate one item at a time (breadth-first instead of depth-first, basically.)
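
    (A hypothetical sketch of that change -- ListItem, head, and a retained 'next' property are made-up names. The item class encodes only its payload, never 'next'; the list's owner flattens the chain into an NSArray for archiving and relinks it after decoding, so the unarchiver never recurses down the chain:)

        - (void)encodeWithCoder:(NSCoder *)aCoder
        {
            NSMutableArray *flat = [NSMutableArray array];
            for (ListItem *item = head; item != nil; item = item.next)
                [flat addObject:item];
            [aCoder encodeObject:flat forKey:@"items"];
        }

        - (id)initWithCoder:(NSCoder *)aDecoder
        {
            if ((self = [super init]))
            {
                NSArray *flat = [aDecoder decodeObjectForKey:@"items"];
                head = [([flat count] ? [flat objectAtIndex:0] : nil) retain];
                for (NSUInteger i = 0; i + 1 < [flat count]; i++)   // relink
                    ((ListItem *)[flat objectAtIndex:i]).next =
                        [flat objectAtIndex:i + 1];
            }
            return self;
        }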

    —Jens
  • On 29/05/2012, at 3:08 PM, James Maxwell wrote:

    > I did try, btw, using encodeConditionalObject for parentNodes, sourceNodes, and superState, all of which are CbCMNodes. But the structure was no longer intact after trying this (parent connections gone), so I think I misunderstood how conditional objects are supposed to work...

    That sounds like it could be a big clue, to me.

    encodeConditionalObject only encodes the object reference if it has been seen by the archiver already. If it hasn't, nil is encoded. So if things are breaking when you use it, it means that parts of the object graph you think you encoded were not.

    It's pretty useful for archiving backpointers (though I agree that recreating these rather than archiving them is probably safer).

    Assuming that your graph has a root node, archiving from root to leaves should be OK, but you should use encodeConditionalObject for any object you think you should have already archived (like parent nodes).
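
    (A minimal sketch of that pattern. Note the conditional applies to the object pointer you pass in, so it suits a single backpointer such as superState; conditionally encoding an NSArray *of* backpointers makes the array itself conditional, and since that array object is never encoded unconditionally anywhere, it decodes as nil -- which may be exactly what broke the parentNodes experiment above:)

        [aCoder encodeObject:childNodes forKey:@"childNodes"];            // forward refs: always encoded
        [aCoder encodeConditionalObject:superState forKey:@"superState"]; // backpointer: reference only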

    Another issue could be that if you are storing your parent and child nodes in an array, you probably have pretty solid retain cycles all over the place. Backpointers should almost always be weak (nonretained) references, which calls for special treatment if you want to put them in an array (use [NSPointerArray pointerArrayWithWeakObjects]).
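
    (A minimal usage sketch; parentNode is just an illustrative variable:)

        // A non-retaining array: its elements are not retained, so
        // parent<->child cycles no longer keep each other alive under RR.
        NSPointerArray *parents = [NSPointerArray pointerArrayWithWeakObjects];
        [parents addPointer:(void *)parentNode];
        CbCMNode *aParent = (CbCMNode *)[parents pointerAtIndex:0];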

    --Graham
  • On May 28, 2012, at 22:08, James Maxwell wrote:

    > Well, yes, that's what happens. In fact, it's much hairier than that! There's actually an array of parentNodes, not just one. It's a complex graph, as I mentioned, not a straightforward tree (which would already contain mutual parent/child references, it seems to me). In my structure, there are about three dimensions of this kind of referencing, allowing me to move around the graph in useful ways… It's encoding hierarchical musical structure, for a music learning/generation algorithm, so there's a lot of complexity involved. As I say, it loads fine with smaller files, and testing the structure after loading indicates that it has been reconstructed properly. This is why I suspect there's a problem with trying to keep track of all these relationships during unarchiving, which NSKeyedUnarchiver does on the stack, afaik. There's an interesting discussion on this here:

    I think I have been stupid, and still am stupid.

    (Still stupid because I don't understand how unarchiving can *validly* resolve mutual references without returning an incompletely-initialized -- in the sense of not yet having "returned self" -- object to at least one of the referrers. But perhaps there's an inherent/implied/undocumented restriction on the kinds of shenanigans that 'initWithCoder:' is allowed to get up to.)

    That aside, I have a suspicion, or perhaps it's just wild speculation, that NSKeyedUnarchiver has some built-in safeguards that mean it can't just (in effect) walk your object graph visiting each node once, but needs (in effect) to walk every path through the graph, so that it actually visits nodes more than once. And it does this through recursion. For a strictly hierarchical object graph, there's no difference between the two strategies, but certain other kinds of object graphs will take a lot of stack space and (even if it doesn't overflow the stack) will perform very badly.

    Whatever the explanation, I think the solution is still not to archive both parent and child pointers. Instead of this:

    > [aCoder encodeObject:parentNodes forKey:@"parentNodes"];
    > [aCoder encodeObject:childNodes forKey:@"childNodes"];

    just this:

    > [aCoder encodeObject:childNodes forKey:@"childNodes"];

    and instead of this:

    > parentNodes = [[aDecoder decodeObjectForKey:@"parentNodes"] retain];
    > childNodes = [[aDecoder decodeObjectForKey:@"childNodes"] retain];

    something like this:

    > parentNodes = [[NSMutableArray alloc] init];
    > childNodes = [[aDecoder decodeObjectForKey:@"childNodes"] retain];
    > for (CbCMNode* node in childNodes)
    >     [node->parentNodes addObject:self];

    If that's not satisfactory, then I think what you need to do is something like what I imagine Core Data does. Instead of archiving object pointers (at least for the potentially mutual references such as parents and children), assign and archive UUIDs, and archive a global dictionary whose keys are the UUIDs and whose objects are the CbCMNode instances. After you've finished unarchiving the object graph, walk it one more time yourself, replacing parent/child UUIDs with the corresponding objects as you go.
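
    (A rough sketch of that UUID scheme under retain/release; nodeID, parentIDs, and the registry dictionary are illustrative names. Children are archived as real references in one direction only; parents travel as UUID strings and are resolved in a flat, non-recursive pass over a registry built after unarchiving:)

        - (void)encodeWithCoder:(NSCoder *)aCoder
        {
            [aCoder encodeObject:nodeID forKey:@"nodeID"];          // e.g. a CFUUID string
            [aCoder encodeObject:childNodes forKey:@"childNodes"];  // real refs, one direction
            NSMutableArray *ids = [NSMutableArray array];
            for (CbCMNode *parent in parentNodes)
                [ids addObject:parent->nodeID];
            [aCoder encodeObject:ids forKey:@"parentIDs"];          // backpointers as strings
        }

        - (id)initWithCoder:(NSCoder *)aDecoder
        {
            if ((self = [super init]))
            {
                nodeID      = [[aDecoder decodeObjectForKey:@"nodeID"] retain];
                childNodes  = [[aDecoder decodeObjectForKey:@"childNodes"] retain];
                parentIDs   = [[aDecoder decodeObjectForKey:@"parentIDs"] retain];
                parentNodes = [[NSMutableArray alloc] init];
            }
            return self;
        }

        // Caller: collect every node into nodesByID (nodeID -> node), then
        // call this once per node -- no recursion, no deep stack.
        - (void)resolveParentsWithRegistry:(NSDictionary *)nodesByID
        {
            for (NSString *uuid in parentIDs)
                [parentNodes addObject:[nodesByID objectForKey:uuid]];
            [parentIDs release];
            parentIDs = nil;
        }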

    Regardless of whether your crash is a bug, a design defect, a limitation, or whatever, one of these approaches should relieve NSKeyedUnarchiver of the need to deal with complicated object graphs.
  • Hi All,

    Yes, I think this general approach of trying to find a way of making reciprocal connections in initWithCoder, rather than archiving all connections, makes a great deal of sense. Quite honestly, this is an implementation of a theoretical model that is, in every conceivable way, a prototype. The priority was to figure out how to build the data structure and get it working as a musical representation. And it hasn't been easy by any stretch of the imagination. Improving and optimizing the implementation is a secondary goal and, while important, is not the focus of my research. That said, I do need to get it working for larger data sets, so I'm going to have to find some way around the current problem.

    The other option of reading all the nodes as UUIDs, loading them, and making the connections later is also a good option, and quite possibly the safest. As I say, the last thing I want to do is break the structure, which has taken a long time to get right (and by "right" I mean as a music representation, not a software implementation). Since the structure is learned from musical examples, I'd rather work with it as it is, and just figure out a way of encoding/decoding it that works.

    Thanks for your help, everybody.

    J.

    James B Maxwell
    Composer/Doctoral Candidate
    School for the Contemporary Arts (SCA)
    School for Interactive Arts + Technology (SIAT)
    Simon Fraser University
  • > Another issue could be that if you are storing your parent and child nodes in an array, you probably have pretty solid retain cycles all over the place. Backpointers should almost always be weak (nonretained) references, which calls for special treatment if you want to put them in an array (use [NSPointerArray pointerArrayWithWeakObjects]).

    Thanks, Graham. This is a good thought. I've not heard of NSPointerArray before… I'll look into it. It sounds like exactly the kind of structure I should be using.

    J.