Core Data: different fetch performance when launching app

  • I have a Core Data app. When I run the app the first time after a
    reboot it takes a long time (20 seconds) to launch. When I restart
    the application the launch time is reduced to 2 seconds.

    When I enable sqldebug I see the following:

    First launch:

    CoreData: sql: SELECT S.Z_ENT, S.Z_PK, S.Z_OPT, S.ZCOMMENTTEXT,
    S.ZICONDATA, S.ZCREATIONDATE, S.ZTOTALBYTES, S.ZLASTMODIFIEDDATE,
    S.ZFOLDERCOUNT, S.ZFILECOUNT, S.ZISPACKAGE, S.ZFULLPATH, S.ZNAME,
    S.ZUUID, S.ZISCONTAINER, S.ZPARENT, S.Z4_PARENT, S.ZCATEGORY,
    S.ZREMOVABLE, S.ZLOCATION, S.ZMEDIATYPEDESCRIPTION,
    S.ZFILESYSTEMTYPENAME, S.ZFREEBYTES, S.ZWRITABLE, S.ZKIND,
    S.ZFILETYPECODE, S.ZCREATORCODE, S.ZCREATOR, S.ZFILEATTRIBUTES,
    S.ZUTI, S.ZGROUP, S.ZINARCHIVE, S.ZPARTOFCATALOG FROM ZSTORAGEITEM S
    WHERE  S.Z_ENT = ? ORDER BY  S.ZNAME

    CoreData: annotation: fetch execution time: 12.439718s

    Launching the application a second time:

    CoreData: sql: SELECT S.Z_ENT, S.Z_PK, S.Z_OPT, S.ZCOMMENTTEXT,
    S.ZICONDATA, S.ZCREATIONDATE, S.ZTOTALBYTES, S.ZLASTMODIFIEDDATE,
    S.ZFOLDERCOUNT, S.ZFILECOUNT, S.ZISPACKAGE, S.ZFULLPATH, S.ZNAME,
    S.ZUUID, S.ZISCONTAINER, S.ZPARENT, S.Z4_PARENT, S.ZCATEGORY,
    S.ZREMOVABLE, S.ZLOCATION, S.ZMEDIATYPEDESCRIPTION,
    S.ZFILESYSTEMTYPENAME, S.ZFREEBYTES, S.ZWRITABLE, S.ZKIND,
    S.ZFILETYPECODE, S.ZCREATORCODE, S.ZCREATOR, S.ZFILEATTRIBUTES,
    S.ZUTI, S.ZGROUP, S.ZINARCHIVE, S.ZPARTOFCATALOG FROM ZSTORAGEITEM S
    WHERE  S.Z_ENT = ? ORDER BY  S.ZNAME

    CoreData: annotation: fetch execution time: 0.161268s

    I use a SQLite persistent store.

    Is this normal behavior? It looks like Core Data caches the fetch/data
    for the current login session.

    Diederik
  • On 13/10/2006, at 13.17, Diederik Hoogenboom wrote:

    > I have a Core Data app. When I run the app the first time after a
    > reboot it takes a long time (20 seconds) to launch. When I restart
    > the application the launch time is reduced to 2 seconds.

    This is quite normal. It is the OS disk cache.

    It is possible that SQLite's disk access pattern is more sensitive to
    caching issues compared to reading a file sequentially.
  • Jakob,

    That would make sense if it was a really large SQLite file. The file
    is only 1.5 MB. 12 seconds is a really long time in SQLite terms.

    Diederik

  • Do you guys think batch faulting can help here?

    Diederik

  • On Oct 13, 2006, at 5:57 AM, Diederik Hoogenboom wrote:

    > That would make sense if it was a really large SQLite file. The
    > file is only 1.5 MB. 12 seconds is a really long time in SQLite
    > terms.

    How many records are there?

        - Scott
  • The store holds about 30,000 records in total. There are
    relationships as well. Everything is bound to an array controller
    and a tree controller. The entity it is querying, StorageItem, is
    the superclass of all the other entities.
    StorageItem has a tree structure; it points to itself via a
    relationship called 'items'. There is also an inverse called
    parentItem. There is a subclass of StorageItem called Catalog, whose
    objects are the top objects of the trees (and have some additional
    properties). When the app starts it displays an NSTableView for the
    Catalog objects and an NSOutlineView for the related StorageItem
    objects (via the items relationship). It should only query all
    Catalog objects (about 10 records) and the first level of the
    StorageItems (about 12 records), right? It doesn't have to go
    through the whole object graph, I guess.

    Diederik

  • On 14/10/2006, at 9.36, Diederik Hoogenboom wrote:

    > The store holds about 30,000 records in total. There are
    > relationships as well. Everything is bound to an array controller
    > and a tree controller. The entity it is querying, StorageItem, is
    > the superclass of all the other entities.
    > StorageItem has a tree structure; it points to itself via a
    > relationship called 'items'. There is also an inverse called
    > parentItem. There is a subclass of StorageItem called Catalog,
    > whose objects are the top objects of the trees (and have some
    > additional properties). When the app starts it displays an
    > NSTableView for the Catalog objects and an NSOutlineView for the
    > related StorageItem objects (via the items relationship). It should
    > only query all Catalog objects (about 10 records) and the first
    > level of the StorageItems (about 12 records), right? It doesn't
    > have to go through the whole object graph, I guess.

    When you create an entity inheritance hierarchy, Core Data merges the
    data for subentities into the root entity table, so in your case all
    your entity instances are in the same table. The Z_ENT attribute is
    like the isa pointer, it identifies the entity type.

    Your SQL query for the Catalog items is essentially this:

    SELECT * FROM ZSTORAGEITEM S WHERE  S.Z_ENT = ? ORDER BY  S.ZNAME

    There is no index on Z_ENT, so SQLite has to do a full table scan to
    find the Catalog instances. It has to look at all 30,000 records, and
    so it needs to read all the blocks of your database file, even if it
    only returns 10 records. There is a good chance it won't be reading
    the blocks sequentially, and that can be really slow when they are
    not cached yet.

    Once the blocks are cached (by both the kernel and SQLite internally)
    a full table scan of 1.5MB is not a big issue.

    There are several ways you can speed up your initial query:

    1. Break up your inheritance hierarchy, so you get more tables. If
    your Catalog instances are in a smaller table, you get them faster.

    2. Use a relationship to fetch the Catalog instances. Relationships
    are indexed, so you avoid the full table scan.

    3. Force the kernel to cache your database file by reading it
    sequentially before the fetch.

    If your Catalog instances are tree roots with a null parentItem, try
    using that in your fetch predicate: "parentItem=nil". Since
    parentItem is a relationship, it is indexed, and you avoid the full
    table scan. That is the simplest solution (if it works...)
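
    For illustration, option 3 could be sketched in plain C (which also
    compiles inside an Objective-C source file) roughly as follows. The
    helper name warm_file_cache is hypothetical and is not part of Core
    Data or SQLite. The sketch assumes the store is a single file small
    enough to read in full (about 1.5 MB in this thread), and whether it
    actually helps depends on the OS keeping those blocks in its disk
    cache.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: read the file at `path` from start to end and throw
 * the data away. The only purpose is to make the kernel pull the file's
 * blocks into its disk cache in sequential order before the first fetch.
 *
 * Returns the number of bytes read, or -1 if the file cannot be opened or
 * a read error occurs.
 */
long warm_file_cache(const char *path)
{
    assert(path != NULL);

    FILE *file = fopen(path, "rb");
    if (file == NULL) {
        return -1;
    }

    char buffer[64 * 1024];   /* read in 64 KB chunks */
    long total = 0;
    size_t count;

    while ((count = fread(buffer, 1, sizeof buffer, file)) > 0) {
        total += (long)count;
    }

    int failed = ferror(file);
    fclose(file);
    return failed ? -1 : total;
}
```

    A caller would pass the path of the SQLite store file and run this
    once before the persistent store is added. A return value of -1 can
    simply be ignored, since the fetch still works without the warm-up,
    only more slowly.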
  • On 10/14/06 2:26 PM, "Jakob Olesen" <stoklund...> wrote:

    Hi All,

    > There are several ways you can speed up your initial query:
    >
    > 1. Break up your inheritance hierarchy, so you get more tables. If
    > your Catalog instances are in a smaller table, you get them faster.
    >
    > 2. Use a relationship to fetch the Catalog instances. Relationships
    > are indexed, so you avoid the full table scan.
    >
    > 3. Force the kernel to cache your database file by reading it
    > sequentially before the fetch.
    >
    > If your Catalog instances are tree roots with a null parentItem, try
    > using that in your fetch predicate: "parentItem=nil". Since
    > parentItem is a relationship, it is indexed, and you avoid the full
    > table scan. That is the simplest solution (if it works...)

    I am always surprised by advice like this.

    While Core Data is intended to "simplify" the developer's job,
    the messages on this list show that developers are quite often forced to:

    a) learn the internal structure of Core Data in depth
    b) learn non-trivial tips and tricks to fight simple problems
    c) spend time and effort redesigning their SIMPLE db structure into
    something complex to satisfy the logic of Core Data
    d) etc.

    Diederik has ONE table, with only 30K records and a *tiny* db file.
    On powerful modern Macs it takes 2-20 seconds to work?
    Guys, open that file in HexEdit - it will scan it in no time.

    If a technology is not able to handle such an EASY db structure
    and such a small amount of data, then is it any good?

    P.S. This is why I do not think that OO DBMS/layers are a good way to go.
    This is why we are pushing our own database, Valentina, toward an OR
    hybrid approach, to get the best of both ... no ... of a FEW worlds.

    P.S.2. No need to start religious wars here :-)
        I am just expressing my opinion about "easy to use" technologies
        that require a long time of deep learning...

    For example, it was fun to see that the .NET 2.0 garbage collection
    manager simplifies the developer's job, only you must remember:

    1) .....
    2) .....
    3) .....
    4) .....
    5) .....
    6) .....
    7) .....
    8) .....
    9) .....
    .....

    --
    Best regards,

    Ruslan Zasukhin
    VP Engineering and New Technology
    Paradigma Software, Inc

    Valentina - Joining Worlds of Information
    http://www.paradigmasoft.com

    [I feel the need: the need for speed]
  • Jakob,

    Thanks! Your last remark did the trick. I just added parentItem ==
    nil to the predicate of the array controller and the fetch now takes
    0.156 seconds the first time.

    I would have expected Z_ENT to be indexed. It's one of the
    primary columns that are always there. Is there a way to add an index
    to a table in the SQLite store, other than creating a relationship?

    Kind regards,

    Diederik

  • On 14/10/2006, at 15.45, Diederik Hoogenboom wrote:

    > Thanks! Your last remark did the trick. I just added parentItem ==
    > nil to the predicate of the array controller and the fetch now takes
    > 0.156 seconds the first time.

    Lucky guess ;-)

    > I would have expected Z_ENT to be indexed. It's one of the
    > primary columns that are always there. Is there a way to add an
    > index to a table in the SQLite store, other than creating a
    > relationship?

    No.

    You can go behind Core Data's back and add the index yourself using
    the sqlite3 command line tool or the SQLite API, but I wouldn't
    recommend that.

    Z_ENT is a low cardinality attribute. It only takes a few distinct
    values across your 30,000 records. Such attributes normally don't
    benefit from being indexed, and each index has a performance penalty
    when inserting records. That is probably why Core Data doesn't index
    that attribute.

    It would be a nice addition to Core Data to be able to add indices.
    Currently, every fetch that doesn't use a relationship or an object
    ID causes a full table scan.
  • Jakob,

    Fetch requests using relationships can be tricky as well. Some time
    ago I tried to write a predicate for a fetch request to retrieve
    objects based on the value of attributes of related objects, like this:

    Example Object Model:

    Car -> Wheel

    A car can have one or more wheels, and a wheel has a property called
    color (to keep it simple, an NSString). The relationship is called
    wheels. There is no inverse relationship.
    What I want is to retrieve all cars that have a wheel whose color
    contains the word 'black'.

    Code I've tried so far:

    NSEntityDescription *entityDescription = [NSEntityDescription
        entityForName:@"Car" inManagedObjectContext:context];
    NSFetchRequest *request = [[[NSFetchRequest alloc] init] autorelease];
    [request setEntity:entityDescription];
    // ANY should match a car if at least one of its wheels has a color
    // containing "black".
    NSPredicate *predicate = [NSPredicate predicateWithFormat:
        @"ANY wheels.color LIKE '*black*'"];
    [request setPredicate:predicate];
    // Pass an NSError so that a failed fetch can be told apart from an
    // empty result.
    NSError *error = nil;
    NSArray *array = [context executeFetchRequest:request error:&error];

    Unfortunately the fetch request does not return any objects.

    Diederik

  • On Oct 14, 2006, at 5:23 AM, Ruslan Zasukhin wrote:

    > I am always surprised by advice like this.
    > While Core Data is intended to "simplify" the developer's job
    [...]
    > P.S.2. No need to start religious wars here :-)
    > I am just expressing my opinion about "easy to use" technologies
    > that require a long time of deep learning...

    I agree, but I just want to clarify that "simplify" and "make very
    simple" are different things. Persistence is hard to get right no
    matter how you slice it. Everybody's app is a little different.

    Core Data takes something that is hard to do on your own, and makes
    it easier for a broad range of cases. That doesn't mean it requires
    zero learning or that it helps *all* cases, just that it's easier
    for most cases.

    If all you want to do is fetch values from a database without any of
    the extras, using the raw SQLite API might be better than Core Data.
    A lot of apps need more than that, though.

        - Scott