NSPropertyListSerialization/plutil bloat in Tiger vs Jaguar

  • Ok, this is bizarre...

    We bundle some data with our application. It's very simple, and way
    back when, we decided to go with the plist format since it was so
    straightforward and easy to read and write. Our data has a root
    dictionary with 931 arrays in it, each array having 999 booleans, so
    something like:

    <dict>
    <key>005</key>
    <array>
      <false/>
      <false/>
      <false/>
      <false/>
      <true/>

    ...

      <false/>
      <false/>
      <false/>
      <false/>
    </array>
    <key>006</key>
    <array>
      <false/>
      <false/>
      <false/>

    ...

    To generate this plist, we read in a delimited text file, loop
    through the records, build the dict, then write out the data
    returned by this call:

    [NSPropertyListSerialization dataFromPropertyList:zipDict
    format:NSPropertyListBinaryFormat_v1_0 errorDescription:&error];
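
    For context, a fuller sketch of what such a tool might look like.
    This is not the original tool: the record format ("key,0,1,0,..."),
    the helper names BuildZipDict and BinaryPlistData, and the use of
    manual reference counting (as was usual in 2005) are all assumptions
    made for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper. Assumed record format: "key,0,1,0,...", one
// record per string; the real input file's layout is not known.
static NSDictionary *BuildZipDict(NSArray *records) {
    NSMutableDictionary *zipDict = [NSMutableDictionary dictionary];
    NSEnumerator *e = [records objectEnumerator];
    NSString *record;
    while ((record = [e nextObject]) != nil) {
        NSArray *fields = [record componentsSeparatedByString:@","];
        if ([fields count] < 2) continue;   // skip blank or malformed records
        NSMutableArray *flags = [NSMutableArray array];
        unsigned long i;
        for (i = 1; i < [fields count]; i++) {
            BOOL on = [[fields objectAtIndex:i] intValue] != 0;
            [flags addObject:[NSNumber numberWithBool:on]];
        }
        // First field is the dictionary key, the rest are the booleans.
        [zipDict setObject:flags forKey:[fields objectAtIndex:0]];
    }
    return zipDict;
}

// Hypothetical helper: serialize the dictionary as a binary plist.
// Returns nil (after logging) if serialization fails.
static NSData *BinaryPlistData(NSDictionary *zipDict) {
    NSString *error = nil;
    NSData *data = [NSPropertyListSerialization
        dataFromPropertyList:zipDict
                      format:NSPropertyListBinaryFormat_v1_0
            errorDescription:&error];
    if (data == nil) {
        NSLog(@"Could not serialize plist: %@", error);
        [error release];   // caller owns the error string (manual refcounting)
    }
    return data;
}
```

    The returned NSData could then be written out with
    -writeToFile:atomically:.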

    I wrote a small tool to do this and would run it on our input file;
    it would create binary plists and all was good.  We have a new input
    file today, so I ran the tool on Tiger (the first time since the
    Jaguar days) and got a huge file: 6.2 MB. On Jaguar, the exact same
    tool with the exact same input file produces 76 KB. I tried on
    Panther and got the 6.2 MB file as well. I used plutil to convert
    both binary files to XML and diff'd them, and they are the same.
    Running 'plutil -convert binary1 thefile-xml.plist' on Tiger gives me
    a 6.2 MB file; the same plutil command on Jaguar gives me a 76 KB
    file from the same XML file.

    I tested our app with the 76 KB file and all seems well, but I'm
    worried I'm missing something, especially since I don't plan to have
    a Jaguar machine forever, and bundling a 76 KB file vs a 6 MB file
    with a 10 MB app is quite a difference. And what is all this extra
    data?

    I tried using NSKeyedArchiver and NSArchiver and they both made large
    files as well. So in the meantime, I'm going to stick with my
    Jaguar-made plist.

    I can make sample code and input/output files available if you have
    any thoughts. Ideally, I'd like to be able to make the smaller-style
    binary plist in code from my tool.

    Thanks!

    -aaron
  • On Dec 28, 2005, at 2:35 AM, Aaron Tuller wrote:

    > Ok, this is bizarre...
    >
    > We bundle some data with our application. It's very simple, and
    > way back when, we decided to go with the plist format since it was
    > so straightforward and easy to read and write. Our data has a root
    > dictionary with 931 arrays in it, each array having 999 booleans,
    > so something like
    > [...]
    >
    > I wrote a small tool to do this and would run it on our input
    > file; it would create binary plists and all was good.  We have a
    > new input file today, so I ran the tool on Tiger (the first time
    > since the Jaguar days) and got a huge file: 6.2 MB. On Jaguar, the
    > exact same tool with the exact same input file produces 76 KB. I
    > tried on Panther and got the 6.2 MB file as well. I used plutil to
    > convert both binary files to XML and diff'd them, and they are the
    > same.  Running 'plutil -convert binary1 thefile-xml.plist' on
    > Tiger gives me a 6.2 MB file; the same plutil command on Jaguar
    > gives me a 76 KB file from the same XML file.

    In Jaguar (10.2), the binary plist writing machinery uniqued
    [identical] arrays, along with the other plist objects except
    dictionaries, which is why the file was small.  But we got so many
    developer complaints that this was SLOW that we stopped uniquing
    arrays as well in Panther (10.3).  There were more changes in Tiger
    (10.4), and we're probably not done yet.  The object uniquing step
    was there to make the files smaller; you must have a lot of identical
    arrays, so it makes a dramatic difference in your case.  However, it
    is also the performance bottleneck in the writing process.  And the
    bottom line is that developers care about speed much, much more than
    size.

    Chris Kane
    Cocoa Frameworks, Apple
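
    Since the writer no longer uniques arrays, one possible way to get a
    small file from code would be to avoid arrays of booleans altogether
    and store each row as a packed NSData bitmap. The sketch below is
    only an illustration of that idea, not something proposed in the
    thread; PackBools and BoolAtIndex are hypothetical helpers, not
    Foundation APIs, and the reading code in the app would have to be
    changed to match.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not a Foundation API): pack boolean NSNumbers
// into a bitmap, eight flags per byte, least significant bit first.
static NSData *PackBools(NSArray *flags) {
    unsigned long count = [flags count];
    // dataWithLength: returns zero-filled bytes, so only true flags are set.
    NSMutableData *bits = [NSMutableData dataWithLength:(count + 7) / 8];
    unsigned char *bytes = (unsigned char *)[bits mutableBytes];
    unsigned long i;
    for (i = 0; i < count; i++) {
        if ([[flags objectAtIndex:i] boolValue]) {
            bytes[i / 8] |= (unsigned char)(1u << (i % 8));
        }
    }
    return bits;
}

// Hypothetical reader-side counterpart: test flag i in a packed bitmap.
static BOOL BoolAtIndex(NSData *bits, unsigned long i) {
    if (i / 8 >= [bits length]) return NO;   // out of range reads as false
    const unsigned char *bytes = (const unsigned char *)[bits bytes];
    return ((bytes[i / 8] >> (i % 8)) & 1) ? YES : NO;
}
```

    Each NSData would take the place of the corresponding array in the
    root dictionary. 999 flags pack into 125 bytes, so 931 rows come to
    roughly 116 KB of payload before plist overhead. The bitmap does not
    record how many flags it holds, so that count would need to be known
    or stored separately.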