mutableBytes Creates Autoreleased Objects

  • It seems that sending mutableBytes creates autoreleased objects (so far tested with ARC only).
    Has anybody else experienced this?

    In code like the following, this may severely impact performance and tie up a lot of memory; the effect apparently depends on the size of the mutable data:

    for (int i = 0; i < BigValue; ++i) {
        NSMutableData* data = [[NSMutableData alloc] initWithCapacity:bigSize];
        char* p = [data mutableBytes];
        ...

        // data released by ARC
    }

    I could alleviate the problem by wrapping the call in an autorelease pool:

    for (int i = 0; i < BigValue; ++i)
    {
        NSMutableData* data = [[NSMutableData alloc] initWithCapacity:bigSize];
        char* p;
        @autoreleasepool {
            p = [data mutableBytes];
        }
        ...
    }

    (In practice, it would probably be more kosher to wrap the whole body of the loop. I did it this way just to illustrate where the autoreleased objects are created.)

    Honestly, that seems quite strange.
    I also would expect this to be mentioned in the docs.

    Regards
    Andreas
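
A minimal sketch of the "wrap the whole loop body" variant mentioned above, assuming ARC. The names `iterations` and `bufferSize` are hypothetical stand-ins for BigValue and bigSize, and the sketch uses initWithLength: instead of initWithCapacity: so that the buffer actually has bytes that are safe to write to:

```objc
#import <Foundation/Foundation.h>

int main(void) {
    const int iterations = 1000;             // hypothetical stand-in for BigValue
    const NSUInteger bufferSize = 1 << 20;   // hypothetical stand-in for bigSize

    for (int i = 0; i < iterations; ++i) {
        // One pool per iteration: anything autoreleased while working with
        // the data is released when this block exits, not at some later
        // drain of an outer pool.
        @autoreleasepool {
            NSMutableData *data = [[NSMutableData alloc] initWithLength:bufferSize];
            char *p = [data mutableBytes];
            p[0] = (char)i;   // use the buffer only inside the pool block
        }
    }
    return 0;
}
```

The pointer p is deliberately not used outside the pool block, since nothing guarantees the buffer outlives it.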
  • On May 12, 2012, at 08:27 , Andreas Grosam wrote:

    > It seems, sending mutableBytes creates autoreleased objects (currently, tested with ARC only).
    > Anybody experienced this, too?
    >
    > In code as below this may severely impact performance and tie up lots of memory, which are apparently dependent on the size of the mutable data:
    >
    > for (int i = 0; i < BigValue; ++i) {
    > NSMutableData* data = [[NSMutableData alloc] initWithCapacity:bigSize];
    > char* p = [data mutableBytes];
    > ...
    >
    > // data released by ARC
    > }
    >
    > I could alleviate the problem by wrapping around an autorelease pool:
    >
    >
    > for (int i = 0; i < BigValue; ++i)
    > {
    > NSMutableData* data = [[NSMutableData alloc] initWithCapacity:bigSize];
    > char* p;
    > @autoreleasepool {
    > char* p = [data mutableBytes];
    > }
    > ...
    > }

    No, the pointer returned by 'mutableBytes' is an interior pointer. It *isn't* an object pointer. (Well, as an implementation detail, it may happen to point to an object's memory, but it often doesn't.)

    As soon as the NSMutableData's lifetime ends -- when it reaches its dealloc -- the 'mutableBytes' pointer becomes instantly invalid.

    Interestingly, this appears to mean that NSData/NSMutableData has become as dangerous in ARC as it is in GC. The compiler may shorten the lifetime of an object variable so that the interior pointer is catastrophically invalidated. This is documented in section 6.1 of the clang ARC spec:

    http://clang.llvm.org/docs/AutomaticReferenceCounting.html#optimization

    but it also gives an answer:

    NSMutableData* data __attribute__ ((objc_precise_lifetime)) = …

    P.S. I think there's also another, better solution, but it involves adding a method to NSData/NSMutableData via a category:

    - (const void*) interiorBytes __attribute__ ((objc_returns_inner_pointer)) {
      return self.bytes;
    }

    - (void*) mutableInteriorBytes __attribute__ ((objc_returns_inner_pointer)) {
      return self.mutableBytes;
    }

    and never using bytes/mutableBytes directly ever again. Perhaps one day bytes/mutableBytes will themselves be marked this way.
  • On May 12, 2012, at 11:37 AM, Quincey Morris wrote:

    > No, the pointer returned by 'mutableBytes' is an interior pointer. It *isn't* an object pointer. (Well, as an implementation detail, it may happen to point to an object's memory, but it often doesn't.)
    >
    > As soon as the NSMutableData's lifetime ends -- when it reaches its dealloc -- the 'mutableBytes' pointer becomes instantly invalid.

    That's not necessarily so.  And/or requesting the mutableBytes may do the equivalent of retain+autorelease on the NSMutableData.

    Consider an inexact analog.  The -[NSString UTF8String] method seems to create an autoreleased NSData (or similar object) to hold the UTF-8-encoded C string that it returns.  It's not returning the NSData object, obviously, it's just using the autoreleased lifetime to manage the lifetime of the C string.

    Anyway, NSMutableData *could* be doing something similar.  I don't know if it is or isn't.  I believe that Andreas is saying that his testing shows that it is.  If it is, Apple may have done this specifically to avoid a problem with internal pointers in ARC.

    Regards,
    Ken
  • On May 12, 2012, at 11:37 AM, Quincey Morris wrote:

    > No, the pointer returned by 'mutableBytes' is an interior pointer. It *isn't* an object pointer. (Well, as an implementation detail, it may happen to point to an object's memory, but it often doesn't.)
    >
    > As soon as the NSMutableData's lifetime ends -- when it reaches its dealloc -- the 'mutableBytes' pointer becomes instantly invalid.
    >
    > Interestingly, this appears to mean that NSData/NSMutableData has become as dangerous in ARC as it is in GC. The compiler may shorten the lifetime of an object variable so that the interior pointer is catastrophically invalidated. This is documented in section 6.1 of the clang ARC spec:
    >
    > http://clang.llvm.org/docs/AutomaticReferenceCounting.html#optimization

    Ugh, so you’re saying that ARC isn’t actually as deterministic as we’ve been led to believe?

    > but it also gives an answer:
    >
    > NSMutableData* data __attribute__ ((objc_precise_lifetime)) = …
    >
    > P.S. I think there's also another, better solution, but it involves adding a method to NSData/NSMutableData via a category:
    >
    > - (void*) interiorBytes __attribute__ ((objc_returns_inner_pointer)) {
    > return self.bytes;
    > }
    >
    > - (void*) mutableInteriorBytes __attribute__ ((objc_returns_inner_pointer)) {
    > return self.mutableBytes;
    > }
    >
    > and never using bytes/mutableBytes directly ever again. Perhaps one day bytes/mutableBytes will themselves be marked this way.

    On the headers on my system, bytes and mutableBytes *are* marked that way.

    Charles
  • On May 12, 2012, at 12:17 PM, Ken Thomases wrote:

    > That's not necessarily so.  And/or requesting the mutableBytes may do the equivalent of retain+autorelease on the NSMutableData.
    >
    > Consider an inexact analog.  The -[NSString UTF8String] method seems to create an autoreleased NSData (or similar object) to hold the UTF-8-encoded C string that it returns. It's not returning the NSData object, obviously, it's just using the autoreleased lifetime to manage the lifetime of the C string.
    >
    > Anyway, NSMutableData *could* be doing something similar.  I don't know if it is or isn't.  I believe that Andreas is saying that his testing shows that it is.  If it is, Apple may have done this specifically to avoid a problem with internal pointers in ARC.

    It looks like that’s indeed what it’s doing. I put a category on NSData to swizzle out its dealloc to something that would log that it was getting dealloced, and then ran a few test cases. Here’s where the data object got dealloced each time:

    int main(int argc, const char * argv[]) {
        @autoreleasepool {
            const char *bytes = "abcd";

            {
                NSMutableData *data = [[NSMutableData alloc] initWithBytes:bytes length:4];

                NSLog(@"leaving block");
            } // data gets dealloced here
            NSLog(@"left block");
        }
        return 0;
    }

    int main(int argc, const char * argv[]) {
        @autoreleasepool {
            const char *bytes = "abcd";
            char *mutableBytes;

            {
                NSMutableData *data = [[NSMutableData alloc] initWithBytes:bytes length:4];
                mutableBytes = [data mutableBytes];

                NSLog(@"leaving block");
            }
            NSLog(@"left block");
        } // data gets dealloced here
        return 0;
    }

    int main(int argc, const char * argv[]) {
        @autoreleasepool {
            const char *bytes = "abcd";
            char *mutableBytes;

            @autoreleasepool {
                NSMutableData *data = [[NSMutableData alloc] initWithBytes:bytes length:4];
                mutableBytes = [data mutableBytes];

                NSLog(@"leaving autoreleasepool block");
            } // data gets dealloced here
            NSLog(@"left autoreleasepool block");
        }
        return 0;
    }

    Charles
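
The test above relies on swizzling dealloc, and that code is not shown in the thread. As a rough alternative sketch (a different technique, not the one Charles used), one could attach an associated object whose own dealloc logs; associated objects are released while their owner is being torn down. `DeallocLogger` and `LogWhenDeallocated` are hypothetical names, and the sketch assumes ARC:

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Hypothetical helper class: logs when it is deallocated, which happens
// when the object it is attached to is being deallocated.
@interface DeallocLogger : NSObject
@property (nonatomic, copy) NSString *label;
@end

@implementation DeallocLogger
- (void)dealloc {
    NSLog(@"%@ is being deallocated", self.label);
}
@end

static char kDeallocLoggerKey;

// Hypothetical helper: ties a DeallocLogger's lifetime to `object`.
static void LogWhenDeallocated(id object, NSString *label) {
    DeallocLogger *logger = [[DeallocLogger alloc] init];
    logger.label = label;
    objc_setAssociatedObject(object, &kDeallocLoggerKey, logger,
                             OBJC_ASSOCIATION_RETAIN_NONATOMIC);
}

int main(void) {
    @autoreleasepool {
        char *mutableBytes;
        {
            NSMutableData *data = [[NSMutableData alloc] initWithBytes:"abcd" length:4];
            LogWhenDeallocated(data, @"data");
            mutableBytes = [data mutableBytes];
            NSLog(@"leaving block");
        }
        NSLog(@"left block");
        (void)mutableBytes;   // silence the unused-variable warning
    }
    return 0;
}
```

Where the "is being deallocated" line appears relative to the other two log lines depends on the SDK, compiler and optimization level, as discussed in the rest of the thread, so no expected output is given here.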
  • On May 12, 2012, at 10:17 , Ken Thomases wrote:

    > That's not necessarily so.  And/or requesting the mutableBytes may do the equivalent of retain+autorelease on the NSMutableData.
    >
    > Consider an inexact analog.  The -[NSString UTF8String] method seems to create an autoreleased NSData (or similar object) to hold the UTF-8-encoded C string that it returns.  It's not returning the NSData object, obviously, it's just using the autoreleased lifetime to manage the lifetime of the C string.

    I think the difference is that for UTF8String, there is an API contract that promises the result will be an object (and it has the lifetime behavior of any returned object that is returned with +0 retain semantics, as the documentation warns).

    For mutableBytes, there's no API contract that it's an object at all, or that it has any particular lifetime semantics if it is. If it was an object of some kind in Andreas's testing, that should be regarded as luck. Note that the documentation for mutableBytes *explicitly* warns that it may point to different things in different framework releases.

    On May 12, 2012, at 10:18 , Charles Srstka wrote:

    > Ugh, so you’re saying that ARC isn’t actually as deterministic as we’ve been led to believe?

    Indeterminism isn't the problem. Unmarked interior pointers *are*.

    > On the headers on my system, bytes and mutableBytes *are* marked that way.

    Yes, they're marked on my OS X 10.7 SDK. They're not marked on my iOS 5.1 SDK, which was the one I happened to look at.

    This means that the time of deallocation of any private memory object that mutableBytes might refer to is affected by at least the following factors:

    1. The optimization level. (Might shorten the compile-time scope of the NSMutableData object by varying amounts.)

    2. The SDK in use. (Might keep the compile-time scope of the NSMutableData object longer based on interior pointers, or not.)

    3. Whether the internals of NSMutableData use autorelease. (May vary in different framework versions, or for NSMutableData objects created with different initializers and/or parameters.)

    4. Whether the NSMutableData is autoreleased for any other reason.

    There's no free lunch here, unless the interior-pointer-returning methods are known to be marked as such. Then ARC hands out very free lunches indeed. :)

    P.S. Hmm. The scope extension from marked interior pointers doesn't really seem to be memory-model specific. I wonder if clang also hands out free lunches to GC code using marked interior pointers. That would be nice.
  • On May 12, 2012, at 12:31 PM, Quincey Morris wrote:

    > I think the difference is that for UTF8String, there is an API contract that promises the result will be an object (and it has the lifetime behavior of any returned object that is returned with +0 retain semantics, as the documentation warns).

    No; -[NSString UTF8String] returns a char*, not an object.

    The difference is that -UTF8String has to allocate new memory to hold the result, because it's not in the same format as the internal string data (which is either UTF-16 or MacRoman and not null-terminated.) I believe internally it creates an autoreleased NSData and returns a pointer to its -bytes.

    -mutableBytes doesn't (and shouldn't!) allocate anything; it just hands back a raw pointer to the NSData's bytes. But what it looks like (as Ken said) is that the implementation of -mutableBytes calls [[self retain] autorelease] to avoid a situation where the caller releases the NSMutableData but still wants to use the bytes. The retain+autorelease ensures that the object and its bytes hang on until the innermost autorelease pool is drained. In other words, I don't believe -mutableBytes allocates any data, but it does prolong the lifespan of its receiver.

    —Jens
  • On May 12, 2012, at 2:31 PM, Quincey Morris wrote:

    > On May 12, 2012, at 10:18 , Charles Srstka wrote:
    >
    >> Ugh, so you’re saying that ARC isn’t actually as deterministic as we’ve been led to believe?
    >
    > Indeterminism isn't the problem. Unmarked interior pointers *are*.

    But if the behavior is deterministic, you can predict where the NSMutableData will get dealloced, and so you’ll know for sure how long the interior pointer will be valid.

    Charles
  • On May 12, 2012, at 12:56 , Jens Alfke wrote:

    > No; -[NSString UTF8String] returns a char*, not an object.

    I meant "object" in the more general sense of a block of allocated memory whose lifetime is managed by a memory model. That includes (at least):

    1. Obj-C objects.

    2. CF…Ref objects.

    3. A few special cases such as UTF8String, where the block is (apparently) autoreleased to keep it from deallocating instantly.

    (In the GC case, these "objects" are blocks managed by the collector.)
  • On 2012-05-12, at 12:37 PM, Quincey Morris wrote:

    > P.S. I think there's also another, better solution, but it involves adding a method to NSData/NSMutableData via a category:
    >
    > - (void*) interiorBytes __attribute__ ((objc_returns_inner_pointer)) {
    > return self.bytes;
    > }
    >
    > - (void*) mutableInteriorBytes __attribute__ ((objc_returns_inner_pointer)) {
    > return self.mutableBytes;
    > }
    >
    > and never using bytes/mutableBytes directly ever again. Perhaps one day bytes/mutableBytes will themselves be marked this way.

    So when a method is declared __attribute__ ((objc_returns_inner_pointer)), then LLVM tracks regular pointers like it would NSObject pointers to see when the owning object can be dealloced? Just want to make sure I understand.

    Dave
  • On May 12, 2012, at 13:55 , Dave Fernandes wrote:

    > So when a method is declared __attribute__ ((objc_returns_inner_pointer)), then LLVM tracks regular pointers like it would NSObject pointers to see when the owning object can be dealloced? Just want to make sure I understand.

    … to see when the owning object's variable's compile time scope ends. Other retain/release optimizations aside, that would be the point where its retain count is decremented, and that might in turn trigger deallocation.

    But, yes, basically that's how I understand the documentation. It seems *way* too convenient to be true. ;)
  • Cool! That would eliminate some of the gymnastics I have been going through to make sure I reference the NSData object some time after I use the internal bytes.

    On 2012-05-12, at 5:05 PM, Quincey Morris wrote:

    > On May 12, 2012, at 13:55 , Dave Fernandes wrote:
    >
    >> So when a method is declared __attribute__ ((objc_returns_inner_pointer)), then LLVM tracks regular pointers like it would NSObject pointers to see when the owning object can be dealloced? Just want to make sure I understand.
    >
    > … to see when the owning object's variable's compile time scope ends. Other retain/release optimizations aside, that would be the point where its retain count is decremented, and that might in turn trigger deallocation.
    >
    > But, yes, basically that's how I understand the documentation. It seems *way* too convenient to be true. ;)
    >
    >
  • On May 12, 2012, at 2:31 PM, Quincey Morris wrote:

    > On May 12, 2012, at 10:17 , Ken Thomases wrote:
    >
    >> That's not necessarily so.  And/or requesting the mutableBytes may do the equivalent of retain+autorelease on the NSMutableData.
    >>
    >> Consider an inexact analog.  The -[NSString UTF8String] method seems to create an autoreleased NSData (or similar object) to hold the UTF-8-encoded C string that it returns.  It's not returning the NSData object, obviously, it's just using the autoreleased lifetime to manage the lifetime of the C string.
    >
    > I think the difference is that for UTF8String, there is an API contract that promises the result will be an object (and it has the lifetime behavior of any returned object that is returned with +0 retain semantics, as the documentation warns).
    >
    > For mutableBytes, there's no API contract that it's an object at all, or that it has any particular lifetime semantics if it is.

    Well, you're the one who asserted something very concrete about when the interior pointer was deallocated.  You were claiming that Andreas couldn't have seen what he said he saw.  My point is that in the absence of a contract you can't be sure.

    > If it was an object of some kind in Andreas's testing, that should be regarded as luck.

    Right, but Andreas wasn't looking to take advantage of the extended lifetime.  The extended lifetime was a problem for him.  He was asking if it made sense that he needed to use an autorelease pool to ensure the data's lifetime was *not* extended.  You asserted that it could not make sense.  I asserted that it might.

    Regards,
    Ken
  • On May 12, 2012, at 22:00 , Ken Thomases wrote:

    > Well, you're the one who asserted something very concrete about when the interior pointer was deallocated.  You were claiming that Andreas couldn't have seen what he said he saw.  My point is that in the absence of a contract you can't be sure.

    Sorry to have misunderstood your point.
  • On May 12, 2012, at 1:55 PM, Dave Fernandes <dave.fernandes...> wrote:
    > So when a method is declared __attribute__ ((objc_returns_inner_pointer)), then LLVM tracks regular pointers like it would NSObject pointers to see when the owning object can be dealloced? Just want to make sure I understand.

    In theory, the definition of objc_returns_inner_pointer allows ARC to track the returned pointer and release the owning object after the returned pointer is no longer used.

    In the current implementation, ARC simply retains and autoreleases the owning object and makes no attempt to track the returned pointer.

    You can suppress the extra retain/autorelease by using objc_precise_lifetime:

        char *bytes;
        {
            NSMutableData *obj __attribute__((objc_precise_lifetime)) = ...;
            // ARC preserves obj's strong reference until variable obj goes out of scope.

            bytes = [obj mutableBytes];
            // Because obj is marked precise-lifetime, ARC does not retain/autorelease the NSMutableData object here.

            bytes[0]++;
            // bytes remains valid here without adding autorelease overhead
        }
        // bytes may be invalid here because obj is no longer in scope.

    --
    Greg Parker    <gparker...>    Runtime Wrangler
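
To make the retain/autorelease behavior described above concrete, here is a conceptual sketch written with manual reference counting (build with -fno-objc-arc). It is not what the compiler literally emits; `InnerPointerKeptAlive` is a hypothetical helper that only imitates the effect of calling a method marked objc_returns_inner_pointer under ARC:

```objc
// Conceptual manual-reference-counting sketch; build with -fno-objc-arc.
#import <Foundation/Foundation.h>

// Hypothetical helper imitating the described effect: the receiver is
// retained and autoreleased, so it survives until the enclosing pool drains.
static void *InnerPointerKeptAlive(NSMutableData *data) {
    [[data retain] autorelease];
    return [data mutableBytes];
}

int main(void) {
    char *p = NULL;
    @autoreleasepool {
        NSMutableData *data = [[NSMutableData alloc] initWithLength:16];
        p = InnerPointerKeptAlive(data);
        [data release];   // the caller's own reference is gone ...
        p[0] = 'x';       // ... but the pool still keeps the buffer alive here
    }
    // p must not be used here: the pool has drained and the data may be gone.
    return 0;
}
```

This also shows why the original loop accumulated memory: each iteration's data object stays alive until the nearest enclosing pool drains, which in a tight loop without an inner pool may be much later.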