CFRunLoopFindMode crash

  • I've pared down my multithreading code even further. I'm not sure
    whether I have one problem or two, but something that I've managed to
    isolate is a crash in CFRunLoopFindMode.

    I can get this both in my application, with only a handful of
    threads, and also in a test program, which creates blocks of 200
    threads at a time, using my multithreading utilities (which do things
    like NSAutoreleasePool handling).

    It seems to crash with a SIGTRAP.

    I saw a few previous references to this, e.g.
    http://www.cocoabuilder.com/archive/message/cocoa/2004/8/4/113693

    There is also R. Tyler Ballance's post and Radar #4563964.

    Does anyone else have any info or tips on this?
  • On 14 nov 2006, at 05.43, Martin Redington wrote:

    > I've pared down my multithreading code even further. I'm not sure
    > whether I have one problem or two, but something that I've managed
    > to isolate is a crash in CFRunLoopFindMode
    >
    > I can get this both in my application, with only a handful of
    > threads, and also in a test program, which creates blocks of 200
    > threads at a time, using my multithreading utilities (which do
    > things like NSAutoRelease pool handling).

    Perhaps you could post the source to your sample application
    somewhere? Include instructions on how to reproduce the crash.

    Given the report from R. Tyler Ballance (that this crash also happens
    in applications from Apple), this could of course very well be a bug
    in NS/CFRunLoop.

    j o a r
  • On Nov 13, 2006, at 10:57 PM, j o a r wrote:

    >
    > On 14 nov 2006, at 05.43, Martin Redington wrote:
    >
    >> I've pared down my multithreading code even further. I'm not sure
    >> whether I have one problem or two, but something that I've managed
    >> to isolate is a crash in CFRunLoopFindMode
    >>
    >> I can get this both in my application, with only a handful of
    >> threads, and also in a test program, which creates blocks of 200
    >> threads at a time, using my multithreading utilities (which do
    >> things like NSAutoRelease pool handling).
    >
    >
    > Perhaps you could post the source to your sample application
    > somewhere? Include instructions on how to reproduce the crash.
    >
    > Given the report from R. Tyler Ballance (that this crash also
    > happens in applications from Apple), this could of course very well
    > be a bug in NS/CFRunLoop.

    Woohoo! I'm being referenced on cocoadev, I'm halfway to famous!
    Anyways, the CFRunLoopFindMode/CFRunLoop crash that I came across is
    somewhat in limbo right now.

    As best as an Apple QA engineer and I could diagnose, something
    was fiddling with the Core Foundation shared library space. While I
    still have a sneaking suspicion that there is a bug lurking around in
    there, the main cause of my issues was that Chax[1] was doing
    something nasty that it shouldn't have been doing, thus crashing
    Core Foundation and taking down some of Apple's applications that
    were intimately tied to CF at the time (in my blog posting, Preview
    and Mail.app).

    While the bug in Radar may or may not be closed right now, I still
    _think_ there is an issue in CFRunLoop, so if you could post sample
    code that lets us isolate the bug, I would be more than happy to
    pitch in and help break it :)

    Of course, you could just give up on CFRunLoop and put all your code
    into a big ol' while (true) {}.... just a thought :-P

    Cheers

    [1] http://www.ksuther.com/chax/

    R. Tyler Ballance: Lead Mac Developer at bleep. software
    contact: <tyler...> | jabber: <tyler...>
  • On 14 Nov 2006, at 15:56, R. Tyler Ballance wrote:

    > Woohoo! I'm being referenced on cocoadev, I'm halfway to famous!
    > Anyways, the CFRunLoopFindMode/CFRunLoop crash that I came across
    > is somewhat in limbo right now.
    >
    > From the best I, and an Apple QA Engineer, could diagnose,
    > something was fiddling with the Core Foundation shared library
    > space. While I still have a sneaking suspicion that there is a bug
    > lurking around in there, the main problem that was causing my
    > issues was that Chax[1] was doing something nasty that it shouldn't
    > have been doing, and thus crashing Core Foundation, and taking some
    > of Apple's applications that were intimately tied to CF at the time
    > along with it (in my blog posting, Preview and Mail.app)

    That's interesting. My dev system is pretty stock, but I have seen
    Mail.app go down quite a few times while I've been debugging. I was
    already beginning to suspect some connection.

    I can reliably replicate the issue (and another possible
    multithreading issue with NSUserDefaults that I posted about earlier).

    > While the bug in radar may or may not be closed right now, I still
    > _think_ there is an issue in CFRunLoop, so if you could post sample
    > code to where we could isolate the bug I would be more than happy
    > to pitch in and help break it :)

    Currently, I'm trying to replicate the issue by modifying the
    SimpleThreads example. My multithreading code was based on that, and
    if I can cleanly replicate it there, then it saves me paring off more
    cruft and/or proprietary code from my test case.

    > Of course, you could just give up on CFRunLoop and put all your
    > code into a big ol' while (true) {}.... just a thought :-P

    if only

        cheers,
              m.
  • On 14 nov 2006, at 17.53, Martin Redington wrote:

    >> From the best I, and an Apple QA Engineer, could diagnose,
    >> something was fiddling with the Core Foundation shared library
    >> space. While I still have a sneaking suspicion that there is a bug
    >> lurking around in there, the main problem that was causing my
    >> issues was that Chax[1] was doing something nasty that it
    >> shouldn't have been doing, and thus crashing Core Foundation, and
    >> taking some of Apple's applications that were intimately tied to
    >> CF at the time along with it (in my blog posting, Preview and
    >> Mail.app)
    >
    > That's interesting. My dev system is pretty stock, but I have seen
    > Mail.app go down quite a few times while I've been debugging. I was
    > already beginning to suspect some connection.

    My suggestion: Before you do anything else, make absolutely sure that
    you haven't got any system extensions, haxies, Input Managers or
    other malware installed on your machine!

    > I can reliably replicate the issue (and another possible
    > multithreading issue with NSUserDefaults that I posted about earlier).

    Post the code and let others take a look.

    j o a r
  • OK. I've posted SimpleThreadsCrash.dmg at

    http://www.mildmanneredindustries.com/downloads/SimpleThreadsCrash.dmg

    This contains a modified version of Apple's SimpleThreads example. To
    see the problem, run it, and hit the "Lots of Threads" button.

    This will create a block of 200 threads, sleep for four seconds, and
    then (try to) destroy the threads. It will try to run 10 such blocks.
    The threads don't try to do anything. Wait a while, and it will crash.

    Note that although large numbers of threads are required to generate
    the crash in the example, in our app we can do this with just two
    threads.

    For best results, run it in the debugger, as it will often take Xcode
    down if you just "run" it from Xcode. Use Activity Monitor to observe
    thread creation and destruction.

    I've commented out lots of the original SimpleThreads code that isn't
    used here.

    As posted, SimpleThreadsCrashes uses the original SimpleThreads
    method for thread shutdown. This is known to leak memory (see the
    original comments in -[Controller killThreads:]), and does not kill
    the threads, but should give you the __CFRunLoopFindMode crash.

    If you uncomment the SIMPLE_THREADS_FIX define in Controller.h, it
    will try to use the fix to SimpleThreads suggested by John Nairn
    (http://www.cocoabuilder.com/archive/message/cocoa/2002/8/23/51214).
    This kills most of the threads, although a few will survive. It
    usually crashes, though not in __CFRunLoopFindMode; it crashes
    somewhere else, during NSConnection setup.

    Finally, if you comment out the ORIGINAL_RUN_LOOP define in
    Controller.h, a slightly different strategy will be used to drive the
    run loop. A boolean shouldRun flag will be set on the TransferServer,
    and as long as this is YES, [[NSRunLoop currentRunLoop]
    runMode:NSDefaultRunLoopMode beforeDate:newDate] will be called
    repeatedly. When the shouldRun flag is set to NO, execution will fall
    out of the loop, and the thread will stop.

    This version appears to shut down all of the worker threads, and
    reliably produces the __CFRunLoopFindMode crash.

    Note that the crash reliably brings down Apple Mail as well for me,
    and possibly other apps too.

    Any clues, hints, or suggestions for alternative approaches
    gratefully received. I'm just about to open a new Radar bug on this
    one ...

        cheers,
              Martin

  • I just noticed that in the code I posted, both the

    ORIGINAL_RUN_LOOP
    SIMPLE_THREADS_FIX

    defines are uncommented. Comment out the SIMPLE_THREADS_FIX in
    Controller.h to get the behaviour of the original Apple SimpleThreads
    code.

  • Having filed my Radar bug (4838357), I'm just soak testing a
    workaround in my app which tries to avoid the situation where the
    crash occurs.

    In the meantime, I booted up an old PowerPC box, to try and replicate
    the error there.

    Sure enough, I get an identical error on PowerPC (my main box, and
    the previous results, were on Intel). I tried using the debug
    frameworks, but the SimpleThreads variant wouldn't run, due to a font
    issue unrelated to this bug. An alternative, command-line test
    program (that I can't post because it contains proprietary code) did
    run with the debug frameworks.

    A screenshot of the debugger, with the crash, is at
    http://www.mildmanneredindustries.com/graphics/CFRunLoopFindModePPC.png

    I couldn't work out how to tell Xcode/GDB where to find the source
    for CFRunLoop, but in CF-268.27, line 371 is as shown below:

    /* call with rl locked */
    static CFRunLoopModeRef __CFRunLoopFindMode(CFRunLoopRef rl,
            CFStringRef modeName, Boolean create) {
        CFRunLoopModeRef rlm;
        struct __CFRunLoopMode srlm;
        srlm._base._isa = __CFISAForTypeID(__kCFRunLoopModeTypeID);
        srlm._base._info = 0;
        _CFRuntimeSetInstanceTypeID(&srlm, __kCFRunLoopModeTypeID); // Line 371
        srlm._name = modeName;

    and _CFRuntimeSetInstanceTypeID is pretty trivial:

    void _CFRuntimeSetInstanceTypeID(CFTypeRef cf, CFTypeID typeID) {
        __CFBitfieldSetValue(((CFRuntimeBase *)cf)->_info, 15, 8, typeID);
    }
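
    To make "pretty trivial" concrete, here is a minimal sketch of what
    that bitfield write amounts to, assuming the macro stores typeID in
    bits 15 through 8 of the 32-bit _info word. The helper names
    (bitfield_mask, bitfield_set, bitfield_get) are made up for this
    illustration and are not the CF macros themselves.

    ```c
    #include <assert.h>
    #include <stdint.h>

    /* Hypothetical helper: mask with bits hi..lo set (inclusive, bit 0 = LSB). */
    static uint32_t bitfield_mask(unsigned hi, unsigned lo) {
        assert(hi <= 31 && lo <= hi);
        unsigned width = hi - lo + 1;
        /* Shifting a 32-bit value by 32 is undefined, so handle the full word. */
        uint32_t low_bits = (width == 32) ? UINT32_MAX
                                          : ((UINT32_C(1) << width) - 1);
        return low_bits << lo;
    }

    /* Hypothetical helper: return word with bits hi..lo replaced by value. */
    static uint32_t bitfield_set(uint32_t word, unsigned hi, unsigned lo,
                                 uint32_t value) {
        uint32_t mask = bitfield_mask(hi, lo);
        /* Clear the field, then OR in the new value (truncated to the field). */
        return (word & ~mask) | ((value << lo) & mask);
    }

    /* Hypothetical helper: read bits hi..lo of word back out. */
    static uint32_t bitfield_get(uint32_t word, unsigned hi, unsigned lo) {
        return (word & bitfield_mask(hi, lo)) >> lo;
    }
    ```

    For example, bitfield_set(0, 15, 8, 0x2A) yields 0x2A00, and bits
    outside 15..8 are left alone. In the excerpt above, the word being
    written belongs to srlm, which is a local struct.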

    One other thing that seems worth noting: in the test program, the
    crash occurs quite late on in the sequence of blocks. Richard Low
    reports a similar-sounding, DO-related issue at
    http://www.wentnet.com/misc/nsproxy.html, where the crash always
    occurred at a certain point in the program (although I'm not certain
    that this was in CFRunLoopFindMode).

    It almost sounds like some sort of counter issue, although I'm not
    sure how that relates to the code where the problem appears to be
    located.

  • I'm using this method to terminate a DO thread. Seems to work fine.

    To terminate a thread, the client calls [server terminate] over DO,
    which causes the server to call [client setServer:nil] over DO
    and release the server proxy before the server thread exits.

    //// Server Class

    + (void)connectWithPorts:(NSArray *)portArray
    {
        NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
        Server *instance = [[self alloc] init];

        NSConnection *connection =
            [[NSConnection alloc] initWithReceivePort:[portArray objectAtIndex:0]
                                             sendPort:[portArray objectAtIndex:1]];
        [connection setRootObject:instance];

        // Connect with the client
        [(id <Client>)[connection rootProxy] setServer:instance];

        // Run until termination
        do {
            [[NSRunLoop currentRunLoop] runMode:NSDefaultRunLoopMode
                                      beforeDate:[NSDate distantFuture]];
        } while ([instance isRunning]);

        // Disconnect from the client. If we don't do this, instance is
        // never released!
        [(id <Client>)[connection rootProxy] setServer:nil];

        [connection release];
        [instance release];
        [pool release];
    }

    - (void)terminate
    {
        running = NO;
    }

    - (BOOL)isRunning
    {
        return running;
    }

    //// Client Class

    - (void)setServer:(id)anObject
    {
        [anObject retain];
        [anObject setProtocolForProxy:@protocol(Server)];
        [server release];
        server = (id <Server>)anObject;
    }

    Best Regards,

    Nir Soffer
  • I'm sure it does *seem* to work fine - my "modern" threading code
    looks almost exactly like this
    (http://developer.apple.com/documentation/Cocoa/Conceptual/Multithreading/articles/CocoaDOComm.html
    provides a model for this). This is pretty much identical to the code
    I posted, if the ORIGINAL_RUN_LOOP define is commented out.

    In my application, in normal conditions, extrapolating from some test
    cases, it would take 50 hours for the user to hit this issue, based
    on a new (short-lived) thread being created every five minutes.
    Unfortunately, this is a real use case for the app. It takes a lot
    less time if you have more than one worker thread running
    simultaneously - this seems to flush the issue out of the woodwork.
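
    As a rough check on what that extrapolation implies, the sketch below
    just does the arithmetic: one new thread every five minutes, for 50
    hours. The function name is hypothetical, and the result is only as
    good as the extrapolation itself.

    ```c
    #include <assert.h>

    /* Hypothetical helper: how many threads get created over `hours`
     * if one new thread starts every `interval_minutes` minutes. */
    static long threads_created(long hours, long interval_minutes) {
        assert(hours >= 0 && interval_minutes > 0);
        return (hours * 60) / interval_minutes;
    }
    ```

    threads_created(50, 5) gives 600, so under these assumptions the
    issue would show up after several hundred thread lifetimes rather
    than a handful.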

    If you're just creating a few worker threads, one at a time, or even
    quite a few worker threads, you probably won't have seen the issue -
    we have thousands of users using the current version of the app, and
    although I suspect some of them are hitting the issue, we only really
    got a clear handle on it when we introduced a polling method, which
    creates a new thread every five minutes.

        cheers,
              Martin

    > On 15/11/2006, at 19:10, Martin Redington wrote:
    >
    >>
    >> ok. I've posted SimpleThreadsCrashes.dmg at
    >>
    >> http://www.mildmanneredindustries.com/downloads/SimpleThreadsCrash.dmg
    >>
    >> This contains a modified version of Apple's SimpleThreads example.
    >> To see the problem, run it, and hit the "Lots of Threads" button.
    >>
    >> This will create a block of 200 threads, sleep for four seconds,
    >> then (try to) destroy the threads. It will try to run 10 such
    >> blocks. The threads don't try to do anything. Wait a while, and it
    >> will crash.
    >>
    >> Note that although large numbers of threads are required in the
    >> example to generate the crash, in our app we can do this with
    >> just two threads.
    >>
    >> For best results, run it in the Debugger, as it will often take
    >> Xcode down if you just "run" it from Xcode. Use Activity Monitor
    >> to observe thread creation and destruction.
    >>
    >> I've commented out lots of the original SimpleThreads code that
    >> isn't used here.
    >>
    >> As posted, SimpleThreadsCrashes uses the original SimpleThreads
    >> method for thread shutdown. This is known to leak memory (see the
    >> original comments in -[Controller killThreads:]), and does not
    >> kill the threads, but should give you the __CFRunLoopFindMode crash.
    >>
    >> If you uncomment the SIMPLE_THREADS_FIX define in Controller.h, it
    >> will try and use the fix to SimpleThreads suggested by John Nairn
    >> (http://www.cocoabuilder.com/archive/message/cocoa/2002/8/23/51214).
    >> This kills most of the threads, although a few will survive. It
    >> usually still crashes, though not in __CFRunLoopFindMode but
    >> somewhere else during NSConnection setup.
    >>
    >> Finally, if you comment out the ORIGINAL_RUN_LOOP define in
    >> Controller.h, a slightly different strategy will be used to drive
    >> the runLoop. A boolean shouldRun flag will be set on the
    >> TransferServer, and as long as this is true, [[NSRunLoop
    >> currentRunLoop] runMode:NSDefaultRunLoopMode beforeDate:newDate]
    >> will be called repeatedly. When the shouldRun flag is set to NO,
    >> execution will fall out of the loop, and the thread will stop.
    >>
    >> This version appears to shut down all of the worker threads, and
    >> reliably produces the __CFRunLoopFindMode crash.
    >>
    >> Note that the crash reliably brings down Apple Mail as well for
    >> me, and possibly other apps too.
    >>
    >> Any clues, hints, or suggestions for alternative approaches
    >> gratefully received. I'm just about to open a new Radar bug on
    >> this one ...
    >>
    >> cheers,
    >> Martin
    >>
    >>
    >> On 14 Nov 2006, at 17:15, j o a r wrote:
    >>
    >>>
    >>> On 14 nov 2006, at 17.53, Martin Redington wrote:
    >>>
    >>>>> From the best I, and an Apple QA Engineer, could diagnose,
    >>>>> something was fiddling with the Core Foundation shared library
    >>>>> space. While I still have a sneaking suspicion that there is a
    >>>>> bug lurking around in there, the main problem that was causing
    >>>>> my issues was that Chax[1] was doing something nasty that it
    >>>>> shouldn't have been doing, and thus crashing Core Foundation,
    >>>>> and taking some of Apple's applications that were intimately
    >>>>> tied to CF at the time along with it (in my blog posting,
    >>>>> Preview and Mail.app)
    >>>>
    >>>> That's interesting. My dev system is pretty stock, but I have
    >>>> seen Mail.app go down quite a few times while I've been
    >>>> debugging. I was already beginning to suspect some connection.
    >>>
    >>> My suggestion: Before you do anything else, make absolutely sure
    >>> that you haven't got any system extensions, haxies, Input
    >>> Managers or other malware installed on your machine!
    >>>
    >>>> I can reliably replicate the issue (and another possible multi-
    >>>> threading issue with NSUserDefaults that I posted about earlier).
    >>>
    >>> Post the code and let others take a look.
    >>>
    >>> j o a r
    >>>
    >>>
    >>
    >
    >
    > Best Regards,
    >
    > Nir Soffer
    >
  • On 17 Nov 2006, at 08:43, Nir Soffer wrote:

    >> In my application, in normal conditions, extrapolating from some
    >> test cases, it would take 50 hours for the user to hit this issue,
    >> based on a new (shortlived) thread being created every five
    >> minutes. Unfortunately, this is a real use case for the app. It
    >> takes a lot less time if you have more than one worker thread
    >> running simultaneously - this seems to flush the issue out of the
    >> woodwork.

    Why not keep one or a few background threads for polling instead of
    recreating them?

    Best Regards,

    Nir Soffer
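
    A minimal sketch of the suggestion above: keep one long-lived
    polling thread instead of spawning a new one every five minutes.
    It is plain C with pthreads (which also compiles as Objective-C)
    and is not code from anyone in this thread. Poller, poller_start,
    poller_stop and count_call are hypothetical names, and a condition
    variable stands in for the NSRunLoop/DO machinery discussed here.
    The should_run flag plays the same role as the running/shouldRun
    flags mentioned earlier.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* All names below are hypothetical; none come from the code in this thread. */
typedef struct {
    pthread_t       thread;
    pthread_mutex_t lock;
    pthread_cond_t  wake;
    bool            should_run;      /* same role as the shouldRun flag */
    long            interval_ms;     /* 5 * 60 * 1000 in the use case above */
    void          (*work)(void *);   /* called once per poll */
    void           *context;
    int             polls_done;
} Poller;

/* Example work callback: counts how often it was called. */
void count_call(void *context) { ++*(int *)context; }

static void *poller_main(void *arg)
{
    Poller *p = arg;
    pthread_mutex_lock(&p->lock);
    while (p->should_run) {
        /* Do the work without holding the lock. */
        pthread_mutex_unlock(&p->lock);
        p->work(p->context);
        pthread_mutex_lock(&p->lock);
        p->polls_done++;

        /* Absolute deadline for the next poll. */
        struct timespec deadline;
        clock_gettime(CLOCK_REALTIME, &deadline);
        deadline.tv_sec  += p->interval_ms / 1000;
        deadline.tv_nsec += (p->interval_ms % 1000) * 1000000L;
        if (deadline.tv_nsec >= 1000000000L) {
            deadline.tv_sec++;
            deadline.tv_nsec -= 1000000000L;
        }
        /* Sleep until the deadline, or until poller_stop() wakes us.
           A zero return with should_run still set is a spurious wakeup. */
        while (p->should_run &&
               pthread_cond_timedwait(&p->wake, &p->lock, &deadline) == 0)
            ;
    }
    pthread_mutex_unlock(&p->lock);
    return NULL;
}

/* Starts the single worker thread; returns 0 on success. */
int poller_start(Poller *p, long interval_ms, void (*work)(void *), void *context)
{
    pthread_mutex_init(&p->lock, NULL);
    pthread_cond_init(&p->wake, NULL);
    p->should_run = true;
    p->interval_ms = interval_ms;
    p->work = work;
    p->context = context;
    p->polls_done = 0;
    return pthread_create(&p->thread, NULL, poller_main, p);
}

/* Clears the flag, wakes the worker, and waits for it to exit. */
void poller_stop(Poller *p)
{
    pthread_mutex_lock(&p->lock);
    p->should_run = false;
    pthread_cond_signal(&p->wake);
    pthread_mutex_unlock(&p->lock);
    pthread_join(p->thread, NULL);
    pthread_mutex_destroy(&p->lock);
    pthread_cond_destroy(&p->wake);
}
```

    Assumed usage: call poller_start(&p, 5 * 60 * 1000, my_poll, ctx)
    once at launch and poller_stop(&p) at quit, where my_poll is your
    own callback. Because the thread is created only once, whatever is
    lost at thread teardown would be lost once rather than every five
    minutes.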
  • I'm using thread pooling now, which seems to work, although I had
    some problems with that before.

    In the meantime, I've also uncovered (with lots of help) the
    underlying source of the problem - it's a mach_port leakage issue -
    the Apple (and other) sample code appears to leak mach_ports like a
    sieve.

    See
    http://www.cocoadev.com/index.pl?DistributedObjectsForInterThreadCommsCrash
    for the gory details, and a fix.

    On 17 Nov 2006, at 06:59, Nir Soffer wrote:

    >
    > On 17 Nov 2006, at 08:43, Nir Soffer wrote:
    >> In my application, in normal conditions, extrapolating from some
    >> test cases, it would take 50 hours for the user to hit this issue,
    >> based on a new (shortlived) thread being created every five
    >> minutes. Unfortunately, this is a real use case for the app. It
    >> takes a lot less time if you have more than one worker thread
    >> running simultaneously - this seems to flush the issue out of the
    >> woodwork.
    >
    > Why not keep one or a few background threads for polling instead of
    > recreating them?
    >
    >
    > Best Regards,
    >
    > Nir Soffer
  • On 17 Nov 2006, at 15:44, Martin Redington wrote:

    > In the meantime, I've also uncovered (with lots of help) the
    > underlying source of the problem - it's a mach_port leakage issue -
    > the Apple (and other) sample code appears to leak mach_ports like a
    > sieve.
    >
    > See
    > http://www.cocoadev.com/index.pl?DistributedObjectsForInterThreadCommsCrash
    > for the gory details, and a fix.

    Thanks for sharing. I'm sure many developers would find this useful -
    unlike those top secret bug reports.

    I can confirm that my code (posted before) also leaks two ports for
    every thread I create and destroy.

    Best Regards,

    Nir Soffer
  • I recently had reason to look at this issue again.

    As it turns out, although the solution we found doesn't leak
    mach_ports, once you start doing stuff on the threads, it starts
    leaking mach_ports again. In my app, I seem to leak a couple of ports
    for each thread.

    I took a look with OmniObjectMeter, and as far as I could make out,
    it looked like these were being over-retained by a runLoop. Typically
    I'd see two or three extra retains. In the original SimpleThreads
    example this is hinted at in a comment by timc:

    /*
    TIMC
    Currently, this routine does not work properly.  Both connection
    objects seem to be retained many more times than necessary -- and the
    number goes up as you make server calls.  So, this is close to what
    is needed, but clearly something else is going on in DO.
    */

    As I'm pooling threads in my app, this doesn't bite me, but if I
    disable thread pooling, and run with the solution described earlier
    in the thread, I leak mach_ports, and eventually die with the
    familiar CFRunLoopFindMode crash. Unfortunately, I don't have a test
    case I can post right now, as the code that "does stuff" is proprietary.

    I can't believe this is really the case - presumably DO is in fairly
    constant use, so surely this would have come up before now - but I
    can't work out what I'm doing wrong.

    Should I just expect some leakage?

    On 17 Nov 2006, at 14:27, Nir Soffer wrote:

    >
    > On 17 Nov 2006, at 15:44, Martin Redington wrote:
    >
    >> In the meantime, I've also uncovered (with lots of help) the
    >> underlying source of the problem - it's a mach_port leakage issue -
    >> the Apple (and other) sample code appears to leak mach_ports like
    >> a sieve.
    >>
    >> See
    >> http://www.cocoadev.com/index.pl?DistributedObjectsForInterThreadCommsCrash
    >> for the gory details, and a fix.
    >
    > Thanks for sharing. I'm sure many developers would find this useful
    > - unlike those top secret bug reports.
    >
    > I can confirm that my code (posted before) also leaks two ports for
    > every thread I create and destroy.
    >
    >
    > Best Regards,
    >
    > Nir Soffer
    >
  • On 4 dec 2006, at 10.41, Martin Redington wrote:

    > I can't believe this is really the case - presumably DO is in
    > fairly constant use, so surely this would have come up before now -
    > but I can't work out what I'm doing wrong.

    I don't think that DO is in much use in "serious" (ie. large and / or
    commercial) applications, so I wouldn't be surprised if bugs were to
    crop up.

    > Should I just expect some leakage?

    No. If you can reproduce this problem in a consistent way, you should
    definitely file a bug report. This sounds serious enough that it
    should be fixed pretty quickly.

    j o a r
  • Damn and blast.

    I tried to adapt the Mandelbrot example from Advanced Mac OS X
    Programming to show this, as it actually does something in the
    worker thread. The original is at
    http://www.borkware.com/corebook/second-edition-files/23-do-threads.tar.gz

    Sure enough, if I just stopped the threads it leaked ports, but with
    the fixes to the original issue on this thread the leak stopped. I'm
    still fairly convinced the remaining leak is real, though. If all of
    the example code leaked ports originally, I can believe this issue
    is real too, and that most people simply don't create enough threads
    over time to hit it. Possibly some of the other code within the
    worker thread is leaking ports independently.

    I was casually watching what other apps do wrt threads and ports -
    the one thing I noticed is that Safari creates and destroys a lot of
    threads (at least if you have as many windows open as I do), but that
    it cleans them all up nicely after itself, so presumably there is a
    right way to do it (although I don't know whether Safari uses DO for
    that).

    On 4 Dec 2006, at 17:57, j o a r wrote:

    >
    > On 4 dec 2006, at 10.41, Martin Redington wrote:
    >
    >> I can't believe this is really the case - presumably DO is in
    >> fairly constant use, so surely this would have come up before now
    >> - but I can't work out what I'm doing wrong.
    >
    > I don't think that DO is in much use in "serious" (ie. large and /
    > or commercial) applications, so I wouldn't be surprised if bugs
    > were to crop up.
    >
    >> Should I just expect some leakage?
    >
    > No. If you can reproduce this problem in a consistent way, you
    > should definitely file a bug report. This sounds serious enough
    > that it should be fixed pretty quickly.
    >
    > j o a r
    >
    >