Understanding objc_assign_strongCast

  • Hi,

    I am having a lot of trouble understanding exactly what the compiler-
    generated objc_assign_strongCast is supposed to do.  The only
    documentation on this is one sentence in the Objective-C release notes.

    From http://developer.apple.com/releasenotes/Cocoa/RN-ObjectiveC/
    > The compiler uses three "helper" functions for assignments of strong
    > pointers to garbage collected memory into global memory
    > (objc_assign_global), garbage collected heap memory
    > (objc_assign_ivar), or into unknown memory (objc_assign_strongCast).
    > For assignments of weak pointers it uses objc_assign_weak and for
    > reads it uses objc_read_weak.
    >

    From what I can tell the compiler (Objective-C++ in my case)
    generates calls to objc_assign_strongCast whenever assigning an
    Objective-C pointer that is an i-var of a C++ class.  I assume it
    would do something similar when assigning a field in a plain-old C
    struct.

    But I have yet to find a single case where this keeps the object from
    being finalized!  I initially thought it wasn't working when the C++
    struct was located in the global data area.  No problem, I followed
    Chris Hanson's advice and introduced CFRetain/CFRelease calls.

    But now I've come across a case where I am certain the C++ object is
    allocated on the C++ heap (via a regular new operator, not
    overloaded).  Even in this case, the objc_assign_strongCast which I am
    reasonably certain the compiler is generating (it shows in the output
    of gcc -S around where I expect it to) completely fails to keep the
    Objective-C object from being finalized.  If that is not the purpose
    of objc_assign_strongCast then may I ask what is its purpose?

    Note that the i-var in this case is a plain-old NSColor*.  There is
    nothing fancy going on here.  The only solution I have found is to
    again apply CFRetain/CFRelease appropriately.

    So what gives?  What am I misunderstanding?

    -Dave
  • Dave,

    objc_assign_strongCast() issues a write barrier, informing the GC
    that the destination value has changed.  But if the only references
    to this pointer are in unscanned (non-GC) memory, then the GC system
    will think it's dead, as no references to that pointer exist in
    scanned (GC live) memory.

    The C++ new operator allocates from malloc(), just as before.
    malloc() memory is not GC scanned.  It's probably easiest to instead
    use CFRetain and balance it with CFRelease in delete/etc.
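
    The reachability argument above can be pictured with a toy model.
    The sketch below is portable C++, not Objective-C, and it does not
    use the real collector: Block, mark, and survives are hypothetical
    names invented for illustration.  It assumes only what is stated
    above, namely that the collector follows pointers out of scanned
    memory, ignores pointers stored in unscanned malloc() memory, and
    treats an object with an outstanding CFRetain as externally
    referenced.

    ```cpp
    #include <cassert>
    #include <set>
    #include <vector>

    // Toy model only: every name here is hypothetical, not a real GC API.
    struct Block {
        bool scanned = false;            // does the collector look inside for pointers?
        int external_refs = 0;           // stands in for an outstanding CFRetain count
        std::vector<const Block*> refs;  // pointers stored inside this block
    };

    // Returns every block reachable from the roots, following pointers
    // only out of scanned blocks. Heap blocks with an external reference
    // are treated as additional roots.
    std::set<const Block*> mark(const std::vector<const Block*>& roots,
                                const std::vector<const Block*>& heap) {
        std::vector<const Block*> work(roots.begin(), roots.end());
        for (const Block* b : heap)
            if (b->external_refs > 0) work.push_back(b);

        std::set<const Block*> live;
        while (!work.empty()) {
            const Block* b = work.back();
            work.pop_back();
            if (!live.insert(b).second) continue;  // already visited
            if (!b->scanned) continue;             // unscanned: its pointers are invisible
            for (const Block* r : b->refs) work.push_back(r);
        }
        return live;
    }

    // True if `target` would be kept alive in this model.
    bool survives(const Block& target,
                  const std::vector<const Block*>& roots,
                  const std::vector<const Block*>& heap) {
        return mark(roots, heap).count(&target) != 0;
    }
    ```

    With a scanned stack root pointing at an unscanned holder block (the
    C++ object from new) that stores the only pointer to a color block,
    survives() reports the color as collectable.  Setting the color's
    external_refs to 1, or marking the holder block as scanned, makes it
    survive in this model.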

    In fact, I've written some pretty hairy hybrid memory model code, and
    have yet to find a use for calling objc_assign_strongCast() myself.

    The GC programming guide talks about using CFRetain/CFRelease in the
    section on Core Foundation objects.  The primary difference with C++
    objects is that C++'s new allocates from unscanned malloc memory,
    whereas CF types (with the default allocator) are allocated from
    scanned memory.
    --

    -Ben
  • Hi Ben,

    Thank you for your response.  It is most helpful to know that CFRetain/
    CFRelease are the most appropriate option for maintaining strong
    references from non-GC objects and that objc_assign_strongCast has
    zero effect unless the pointer itself is located in a GC-managed
    heap.  I've observed first-hand that it has zero effect in global
    memory and zero effect in non-GC heaps and from my understanding ought
    not to be needed on the stack.  Did I miss any other notable types of
    memory?

    Now that you mention CF objects it occurs to me that
    objc_assign_strongCast is intended for i-vars of CF objects located in
    GC-managed heap.  Or to put it another way, it is less related to
    objc_assign_global and more like objc_assign_ivar for i-vars that are
    in POD structs allocated in GC-managed heap.  Some time in the future
    I may endeavor to push the limits of this theory as it sounds quite
    tempting to apply the GC to other code.

    The only remaining question I have pertains to compatibility with
    prior OS X versions.  Clearly CFRetain/CFRelease work on Leopard.  And
    as far as I know code with GC-write barriers can only work on Leopard,
    even when it is compiled to support both RR and GC, due to the simple
    fact that the objc_assign_* series of functions do not exist on older
    versions of OS X.

    Keep in mind I am writing a library (wxCocoa) that must work in both
    RR and GC mode.  Therefore my usage of CFRetain/CFRelease will be
    balanced appropriately so as to function at runtime correctly
    regardless of mode.  However, if compiling for pure-RR mode is it
    still reasonable to use CFRetain/CFRelease in lieu of retain/release
    or will this cause me problems when targeting older versions of OS X?
    Do I need to condition CFRetain/CFRelease on garbage collection being
    potentially supported or is it safe to just judiciously replace retain/
    release with CFRetain/CFRelease and expect the same code to run on
    prior versions of OS X?

    Hopefully I understand enough now and hopefully your reply will prove
    useful in the archives to other people. Again, thanks for the response.

    -Dave

  • On Feb 6, 2008, at 5:23 PM, Ben Trumbull wrote:

    > objc_assign_strongCast() will issue a write barrier, informing GC
    > that the destination value has changed.  But if the only references
    > to this pointer are in unscanned (not GC) memory, then the GC system
    > will think it's dead as no references to that pointer exist in
    > scanned (GC live) memory.
    >
    > The C++ new operator allocates from malloc(), just as before.
    > malloc() memory is not GC scanned.  It's probably easiest to instead
    > use CFRetain and balance it with CFRelease in delete/etc.

    You could also define a custom "new" operator for the class in
    question that allocates its memory using NSAllocateCollectable(...,
    NSScannedOption), but that might be more trouble than it's worth.

    --Chris N.
  • Hi Chris,

    On Feb 7, 2008, at 12:50 PM, Christopher Nebel wrote:

    > On Feb 6, 2008, at 5:23 PM, Ben Trumbull wrote:
    >
    >> objc_assign_strongCast() will issue a write barrier, informing GC
    >> that the destination value has changed.  But if the only references
    >> to this pointer are in unscanned (not GC) memory, then the GC
    >> system will think it's dead as no references to that pointer exist
    >> in scanned (GC live) memory.
    >>
    >> The C++ new operator allocates from malloc(), just as before.
    >> malloc() memory is not GC scanned.  It's probably easiest to
    >> instead use CFRetain and balance it with CFRelease in delete/etc.
    >
    > You could also define a custom "new" operator for the class in
    > question that allocates its memory using NSAllocateCollectable(...,
    > NSScannedOption), but that might be more trouble than it's worth.
    >

    The problem then becomes: what references that object?  Obviously, if
    that object doesn't have a strong reference, it will be considered
    finalizable, and so will the objects it references.

    But yes, the thought crossed my mind and I'm sure I will find such a
    technique useful in the future.

    -Dave
  • On Feb 6, 2008, at 11:50 PM, David Elliott wrote:

    > Thank you for your response.  It is most helpful to know that
    > CFRetain/CFRelease are the most appropriate option for maintaining
    > strong references from non-GC objects and that
    > objc_assign_strongCast has zero effect unless the pointer itself is
    > located in a GC-managed heap.  I've observed first-hand that it has
    > zero effect in global memory and zero effect in non-GC heaps and
    > from my understanding ought not to be needed on the stack.  Did I
    > miss any other notable types of memory?

    There is a separate objc_assign_global().  You *can* put GC objects
    into globals (assuming they are id or __strong types).
    objc_assign_ivar() is for assigning to an ivar of an object (something
    with an Objective-C Class).  There are some additional details in the
    GC programming guide, as well as the Leopard (10.5) release notes for
    Objective-C and Garbage Collection (some GC stuff is nestled away
    under the ObjC runtime).

    But that's mostly for entertainment purposes.  CFRetain and CFRelease
    work well to manage "external references" (stuff holding a GC object
    from non-GC memory).  I've never needed to use any of the objc_assign*
    functions directly for any strong references.
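
    A C++ holder for such an external reference might look like the
    sketch below.  ExternalRef, FakeObject, fake_retain, and fake_release
    are hypothetical names; the fakes only stand in for CFRetain and
    CFRelease so that the sketch compiles without Core Foundation.  The
    function-pointer types mirror the shape of CFRetain (takes and
    returns a const void*) and CFRelease (takes a const void*).  The
    holder checks for null before every call, because CFRetain and
    CFRelease crash on NULL.

    ```cpp
    #include <cassert>

    // Shapes of the retain/release functions the holder is built on.
    typedef const void* (*RetainFn)(const void*);
    typedef void (*ReleaseFn)(const void*);

    // Hypothetical RAII holder: retains on construction and copy,
    // releases on destruction, and never passes a null pointer through.
    template <RetainFn Retain, ReleaseFn Release>
    class ExternalRef {
    public:
        explicit ExternalRef(const void* obj = nullptr) : obj_(obj) {
            if (obj_) Retain(obj_);
        }
        ExternalRef(const ExternalRef& other) : obj_(other.obj_) {
            if (obj_) Retain(obj_);
        }
        ExternalRef& operator=(const ExternalRef& other) {
            if (other.obj_) Retain(other.obj_);  // retain first: safe for self-assignment
            if (obj_) Release(obj_);
            obj_ = other.obj_;
            return *this;
        }
        ~ExternalRef() {
            if (obj_) Release(obj_);
        }
        const void* get() const { return obj_; }

    private:
        const void* obj_;
    };

    // Stand-ins for CFRetain/CFRelease so the sketch runs anywhere.
    struct FakeObject { int count; };

    inline const void* fake_retain(const void* p) {
        ++static_cast<FakeObject*>(const_cast<void*>(p))->count;
        return p;
    }

    inline void fake_release(const void* p) {
        --static_cast<FakeObject*>(const_cast<void*>(p))->count;
    }
    ```

    In real code one would presumably instantiate the holder as
    ExternalRef<CFRetain, CFRelease> and make it a member of the C++
    class in place of the raw pointer.  That instantiation is an
    assumption and is untested here.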

    > The only remaining question I have pertains to compatibility with
    > prior OS X versions.  Clearly CFRetain/CFRelease work on Leopard.
    > And as far as I know code with GC-write barriers can only work on
    > Leopard, even when it is compiled to support both RR and GC, due to
    > the simple fact that the objc_assign_* series of functions do not
    > exist on older versions of OS X.

    GC only works on Leopard, but several of the objc_assign_* functions
    are available on Tiger (as their RR no-ops).  I *believe* you can
    compile mixed mode code for Leopard and Tiger so long as you do not
    use any __weak references, but I'm not certain.  I'd recommend trying
    a small sample project on Leopard and testing the binary on Tiger.

    > Keep in mind I am writing a library (wxCocoa) that must work in both
    > RR and GC mode.  Therefore my usage of CFRetain/CFRelease will be
    > balanced appropriately so as to function at runtime correctly
    > regardless of mode.

    That is possible, although I haven't tried adding in the twist of
    running on Tiger and Leopard.  Nearly all the system frameworks on
    Leopard run in both RR and GC mode.

    > However, if compiling for pure-RR mode is it still reasonable to use
    > CFRetain/CFRelease in lieu of retain/release or will this cause me
    > problems when targeting older versions of OS X?  Do I need to
    > condition CFRetain/CFRelease on garbage collection being potentially
    > supported or is it safe to just judiciously replace retain/release
    > with CFRetain/CFRelease and expect the same code to run on prior
    > versions of OS X?

    There are a few factors at work here, so typically one does not use
    CFRetain/CFRelease in lieu of retain/release wholesale.

    (1) CFRetain and CFRelease crash on NULL, so you cannot dispatch to
    them blindly, as you can with -retain and -release.  You have to check
    for nil.
    (2) You don't have to replace all your uses of -retain/-release with
    CFRetain/CFRelease.  You only need them when you want to assign into
    non-GC memory or otherwise hold an external reference to a block of
    GC memory.
    (3) You must balance a call to -retain/+alloc with -release/
    -autorelease.  You must balance a call to CFRetain/CFCreate... with a
    call to CFRelease.  You cannot balance -retain with CFRelease.  You
    cannot balance a call to CFCreate... with -release (this does work
    under RR, but breaks dual-mode or GC code).
    (4) -retain, -release, and -autorelease are no-ops under GC.

    Rules 3 and 4 combine into an interesting pattern:

    id foo = [[NSObject alloc] init];  // RR count 1; GC external reference (er)
                                       // count 0, because +alloc does not force
                                       // an er under GC
    if (foo) CFRetain((void*)foo);     // RR count 2, GC er count 1
    [foo release];                     // RR count 1, GC er count 1, because
                                       // -release is a no-op under GC

    At this point 'foo' has the same hard reference count under both RR
    and GC.  It can be freed equally well under either mode with

    if (foo) CFRelease((void*)foo);  // RR count 0; GC er count 0
    // At some point later either -dealloc or -finalize, if any, will get
    // called.

    You don't need this if 'foo' can be constructed by a CFCreate...
    function since CF functions create objects that have an er count of 1
    as they must be balanced by CFRelease.
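
    The counting in this pattern can be checked with a small model.  The
    sketch below is portable C++ bookkeeping only, not real Objective-C
    or Core Foundation: Mode, Object, alloc_init, retain, release,
    cf_retain, cf_release, and make_externally_held are hypothetical
    names.  It assumes the behavior described above, that +alloc gives a
    count of 1 under RR and no external reference under GC, that -retain
    and -release are no-ops under GC, and that CFRetain and CFRelease
    adjust the count in both modes.

    ```cpp
    #include <cassert>

    // Toy bookkeeping only; these names are hypothetical stand-ins.
    enum class Mode { RetainRelease, GarbageCollected };

    struct Object {
        Mode mode;
        int count;  // RR retain count, or GC external reference (er) count
    };

    // +alloc/-init: count 1 under RR, no external reference under GC.
    inline Object alloc_init(Mode mode) {
        return Object{mode, mode == Mode::RetainRelease ? 1 : 0};
    }

    // -retain and -release do nothing under GC.
    inline void retain(Object& o) {
        if (o.mode == Mode::RetainRelease) ++o.count;
    }
    inline void release(Object& o) {
        if (o.mode == Mode::RetainRelease) --o.count;
    }

    // CFRetain and CFRelease adjust the count in both modes.
    inline void cf_retain(Object& o) { ++o.count; }
    inline void cf_release(Object& o) { --o.count; }

    // The pattern from the message: +alloc, CFRetain, then -release.
    inline Object make_externally_held(Mode mode) {
        Object foo = alloc_init(mode);
        cf_retain(foo);
        release(foo);
        return foo;
    }
    ```

    In this model make_externally_held() leaves a count of 1 in both
    modes, and a single cf_release() brings it to 0 in both.  Balancing
    -retain with CFRelease on a freshly allocated object under GC drives
    the model's count to -1, which is the kind of imbalance rule (3)
    warns about.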

    This pattern is useful for dealing with a problem encountered by
    dual memory mode code.  -finalize is called in a completely
    indeterminate order.  You absolutely cannot message another object in
    -finalize unless you can guarantee it is alive.  That is really hard
    to know unless you put a CFRetain on it yourself, and CFRelease it in
    -finalize.  This includes your own ivars!  Your ivars can be finalized
    before you!

    Now even if you do have a -finalize method, most of your ivars won't
    need to be messaged.  Most messages to ivars in -dealloc are -release,
    which you don't need at all in a -finalize method.  So this is a
    special case.  In hundreds of classes, with plenty of ivars apiece,
    I've only used it 10 or 11 times.

    This is mainly an issue for dual-mode code, because it's much easier
    for GC-only code to be structured with fewer (ideally zero) -finalize
    methods.  -finalize methods can be quite expensive, so avoiding them
    is best.

    Writing a dual mode framework is not trivial.

    You could override the new operator to allocate scanned collectible
    memory.  I don't have any experience writing Objective-C++ code under
    GC, so I'm not sure it's any easier than the CFRetain approach.

    - Ben
