CoreData Best Practices
-
I'd like some advice on the best way to use CoreData in various
situations. I'm trying to create a BibTeX manager. The problem I'm
facing is that BibTeX allows user defined fields and multiple
authors. The obvious way to handle this is to store multiple authors
as an NSArray of NSStrings and the user defined fields in an
NSDictionary. However, these types aren't available as attributes in
CoreData. The best way I can think of to overcome this is to
serialize dictionaries and arrays and store them as data attributes.
This doesn't seem to be a solution and I wondered if anyone else had
a better one.
Thanks,
~Jim -
The thing to do is create separate 'author' entities and then add a
to-many relationship to the object in question. Let CoreData handle it.
It takes a bit of getting used to working with objects at this
granularity but it is worth it in the end.
John Brownlow
Deep Fried Films, Inc
http://www.johnbrownlow.com
http://www.pinkheadedbug.com -
On Apr 29, 2005, at 7:49 AM, James Clause wrote:
> I'd like some advice on the best way to use CoreData in various
> situations. I'm trying to create a BibTeX manager. The problem I'm
> facing is that BibTeX allows user defined fields and multiple authors.
> The obvious way to handle this is to store multiple authors as an
> NSArray of NSStrings and the user defined fields in an NSDictionary.
Use a to-many relationship for the authors. These are represented at
code level as NSMutableSets, since Core Data relationships have no
inherent order.
The best implementation I've found for user-defined fields is a
collection of CustomUserValue objects which have a relationship to
their parent object. You can try all day and night to use a dictionary
for this, but you'll only make things hard on yourself. In particular,
you'll limit your ability to do some fancy searching stuff.
The CustomUserValue should look something like:
customValue - string
internalKey - string
displayKey - string
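As a rough illustration of the kind of searching this makes possible,
here is a sketch that finds every parent object carrying a given
user-defined field. It assumes an entity named CustomUserValue with the
attributes listed above and a to-one relationship called "parent"; the
relationship name and the function name are made up for the example,
and the code assumes ARC.

```objective-c
#import <CoreData/CoreData.h>

// Returns the parents of all CustomUserValue objects whose internalKey
// equals `key` and whose customValue contains `text` (case-insensitive).
// Returns nil if the fetch itself fails.
// Assumed model: entity "CustomUserValue" with a to-one relationship
// named "parent" (hypothetical name).
static NSSet *ParentsWithCustomValue(NSString *key, NSString *text,
                                     NSManagedObjectContext *moc,
                                     NSError **error)
{
    NSFetchRequest *request = [[NSFetchRequest alloc] init];
    [request setEntity:
        [NSEntityDescription entityForName:@"CustomUserValue"
                    inManagedObjectContext:moc]];
    [request setPredicate:[NSPredicate predicateWithFormat:
        @"internalKey == %@ AND customValue CONTAINS[c] %@", key, text]];

    NSArray *values = [moc executeFetchRequest:request error:error];
    if (values == nil) {
        return nil;
    }

    // Collect each matching value's parent; the set drops duplicates.
    NSMutableSet *parents = [NSMutableSet set];
    for (NSManagedObject *value in values) {
        id parent = [value valueForKey:@"parent"];
        if (parent != nil) {
            [parents addObject:parent];
        }
    }
    return parents;
}
```

With a dictionary stored as archived data, the same query would mean
unarchiving every object's dictionary in memory.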
- Scott
--
http://treehouseideas.com/
http://theocacao.com/ [blog] -
If you decide you absolutely need an array of strings, there's a way to do
it:
Subclass NSManagedObject and add an array ivar
Make array retrieve/add/delete methods
Have keyed value that's a string or data
When the array changes, have it save its string representation or coded data
to the keyed value. If the array is requested when its ivar is nil,
simply re-create it from the keyed value.
There's going to be a problem with large arrays, so this might be painfully
slow with the human genome publication, but it's fine for most instances.
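Roughly, the steps above might look like the sketch below. The class
name, the accessor names, and the Binary attribute "authorNamesData"
are all assumptions for illustration. It uses NSPropertyListSerialization
as the "coded data" form (fine for an array of strings), assumes ARC,
and ignores undo and refresh, either of which could leave the cached
array stale.

```objective-c
#import <CoreData/CoreData.h>

// NSManagedObject subclass that exposes an ordered array of strings but
// persists it as one Binary attribute, "authorNamesData" (hypothetical).
@interface Paper : NSManagedObject {
    NSArray *cachedAuthorNames;   // in-memory copy; nil until requested
}
- (NSArray *)authorNames;
- (void)setAuthorNames:(NSArray *)names;
- (void)addAuthorName:(NSString *)name;
@end

@implementation Paper

- (NSArray *)authorNames
{
    if (cachedAuthorNames == nil) {
        // Re-create the array from the stored data on first access.
        NSData *data = [self valueForKey:@"authorNamesData"];
        if (data != nil) {
            cachedAuthorNames = [NSPropertyListSerialization
                propertyListWithData:data
                             options:NSPropertyListImmutable
                              format:NULL
                               error:NULL];
        }
        if (cachedAuthorNames == nil) {
            cachedAuthorNames = [NSArray array];
        }
    }
    return cachedAuthorNames;
}

- (void)setAuthorNames:(NSArray *)names
{
    cachedAuthorNames = (names != nil) ? [names copy] : [NSArray array];
    // Every change re-encodes the whole array, which is where the cost
    // for very long author lists comes from.
    NSData *data = [NSPropertyListSerialization
        dataWithPropertyList:cachedAuthorNames
                      format:NSPropertyListBinaryFormat_v1_0
                     options:0
                       error:NULL];
    [self setValue:data forKey:@"authorNamesData"];
}

- (void)addAuthorName:(NSString *)name
{
    [self setAuthorNames:[[self authorNames] arrayByAddingObject:name]];
}

- (void)didTurnIntoFault
{
    cachedAuthorNames = nil;      // drop the cache along with the object
    [super didTurnIntoFault];
}

@end
```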
JT
_______________________________________________
This mind intentionally left blank -
Sorry, didn't quite finish up my thought - got distracted by a phone call.
If you need to hold a more complex object (like a dictionary), model it as a
Core Data object, but include a field for a unique identifier (either string
or NSNumber). Store that unique identifier in the array, and use it to
retrieve the appropriate "author" entry.
The alternative to this would be to have a location field in the author
object, so that you could use it to sort the set.
Neither of these are great solutions, but both work pretty well.
JT
_______________________________________________
This mind intentionally left blank -
On Apr 29, 2005, at 1:34 PM, James Clause wrote:
> I'd like some advice on the best way to use CoreData in various
> situations. I'm trying to create a BibTeX manager. The problem I'm
> facing is that BibTeX allows user defined fields and multiple
> authors. The obvious way to handle this is to store multiple authors
> as an NSArray of NSStrings and the user defined fields in an
> NSDictionary. However, these type aren't available as attributes in
> CoreData. The best way I can think of to overcome this is to
> serialize dictionaries and arrays and store them as data attributes.
> This doesn't seem to be a solution and I wondered if anyone else had
> a better one.
To store the authors:
Create an Author entity in your Core Data model and create a
relationship between your existing entity and the Author entity.
Create new Author entity instances, as needed, and relate them to
your -- for lack of a better term -- Paper entity, as needed. If
you are managing multiple Papers, it is likely that any one Author
may be an author of multiple Papers and, therefore, you should use a
fetch specification to grab an existing Author, if present, and
relate that existing Author instead of creating duplicate Authors.
With a reverse relationship from Author -> Paper, you could then ask
an Author for all of its Papers. Core Data will take care of
managing the inverse relationships automatically.
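A minimal sketch of the fetch-before-insert step described above. The
entity name "Author", its "name" attribute, the "authors" relationship
on the paper, and both function names are assumptions for illustration;
matching on an exact name string is only one possible notion of "the
same author". The code assumes ARC.

```objective-c
#import <CoreData/CoreData.h>

// Returns an existing Author whose name matches exactly, or inserts a
// new one. Returns nil only if the fetch itself fails.
static NSManagedObject *FindOrCreateAuthor(NSString *name,
                                           NSManagedObjectContext *moc,
                                           NSError **error)
{
    NSFetchRequest *request = [[NSFetchRequest alloc] init];
    [request setEntity:[NSEntityDescription entityForName:@"Author"
                                   inManagedObjectContext:moc]];
    [request setPredicate:
        [NSPredicate predicateWithFormat:@"name == %@", name]];
    [request setFetchLimit:1];

    NSArray *matches = [moc executeFetchRequest:request error:error];
    if (matches == nil) {
        return nil;                          // fetch failed
    }
    if ([matches count] > 0) {
        return [matches objectAtIndex:0];    // reuse the existing Author
    }

    NSManagedObject *author =
        [NSEntityDescription insertNewObjectForEntityForName:@"Author"
                                      inManagedObjectContext:moc];
    [author setValue:name forKey:@"name"];
    return author;
}

// Relates the author to the paper through the to-many "authors"
// relationship; Core Data updates the inverse side.
static void AddAuthorToPaper(NSManagedObject *author,
                             NSManagedObject *paper)
{
    [[paper mutableSetValueForKey:@"authors"] addObject:author];
}
```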
To store the key/value pairs:
Create a KeyValuePair entity that contains a 'key' attribute and a
'value' attribute. Then, create instances of those key/value pairs
and relate them to your Paper entities, as necessary.
Alternatively, you could store the values in an attribute marked as
Binary, then use a derived attribute to archive/unarchive the
dictionary/array into/out-of that attribute. This is suboptimal for
a number of reasons and not really recommended, but can be useful in
certain circumstances.
---
In general, when creating the data model for your application,
consider using managed objects for any place that you think you might
need an NSArray or NSDictionary. An NSArray implies a
relationship. NSDictionary implies an alternative form of data
storage and is not hard to push into Core Data.
b.bum -
So, I only bring up the following because I'm interested in seeing the best
solution the list can come up with -
Two issues regarding your advice that are specific to bibliographic
information:
Most sources of author information (e.g. PubMed) don't track individual
authors, so he'll essentially have no way of identifying when two authors
are identical. There will probably be an author for every instance of
authorship. He could match first, last and initials to save some space, but
it's not clear where the performance of the search vs. memory use of
separate instances falls, especially once the author list gets very large.
The second thing is that, although relating authors to the paper is easy,
retaining the order of the multiple authors for a publication is absolutely
essential. There has to be some way of reconstructing an ordered list out
of that relationship set.
I came up with:
Stick a unique ID in every author, and maintain an array of those which, as
you said, is suboptimal.
Stick an order key in every author, and use that to sort an array created
out of the set. This seems better, but commits you to having a unique
author instance for every place an author appears.
I have the nagging sense there must be something better, but I haven't come
up with it.
_______________________________________________
This mind intentionally left blank -
On Apr 29, 2005, at 3:29 PM, John Timmer wrote:
> Most sources of author information (ie - PubMed) don't track
> individual authors, so he'll essentially have no way of identifying
> when two authors are identical.
When we wrote such a tool, we parsed the pubmed author list, and
noted that Andrews J A could match a number of different authors.
Andrews J A was thus a separate author in our databases than John
Alan Andrews. A search for Andrews J A would return both.
> The second thing is that, although relating authors to the paper is
> easy, retaining the order of the multiple authors for a publication
> is absolutely essential. There has to be some way of
> reconstructing an ordered list out of that relationship set.
The best way I found was to store an index set.
Paper -> author_index_set ->> authors
If author_id is the unique author identifier, author_index_set rows
consist of:
    author_index_set_id
    author_id
    author_order
For CD/WO-like systems where the pk is not user-serviceable data, we
allow the system to add a separate pk column. For other tools,
author_index_set_id and author_id act as a composite key.
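One way this might map onto Core Data, as a sketch only: model each
row as a join object that the paper reaches through a to-many
relationship, and sort those objects by their order attribute. The
names "authorships", "authorOrder" and "author" are hypothetical, as is
the function name, and the code assumes ARC.

```objective-c
#import <CoreData/CoreData.h>

// Rebuilds the ordered author list for a paper.
// Assumed model: the paper has a to-many relationship "authorships";
// each object in it has a numeric "authorOrder" attribute and a to-one
// "author" relationship.
static NSArray *OrderedAuthorsForPaper(NSManagedObject *paper)
{
    NSSet *authorships = [paper valueForKey:@"authorships"];
    NSSortDescriptor *byOrder =
        [[NSSortDescriptor alloc] initWithKey:@"authorOrder"
                                    ascending:YES];
    NSArray *sorted = [[authorships allObjects]
        sortedArrayUsingDescriptors:[NSArray arrayWithObject:byOrder]];
    // -valueForKey: on an array collects "author" from each element,
    // preserving the sorted order.
    return [sorted valueForKey:@"author"];
}
```

Because the order lives on the join object rather than on the author,
one author instance can appear at different positions in different
papers.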
Scott -
On Apr 29, 2005, at 3:29 PM, John Timmer wrote:
> Stick an order key in every author, and use that to sort an array
> created out of the set. This seems better, but commits you to having
> a unique author instance for every place an author appears.
It sounds like you need this anyway since earlier in your post you said
there are essentially no ways of identifying when two authors are
identical.
-- Chris -
Ah, finally I can ask the Core Data question that's been bugging me
for a long time.
I really like how easy it is to use NSPersistentDocument, and how
well-integrated it is with IB and Bindings. However, this class
assumes the standard 1-file, one NSDocument paradigm. It creates its
coordinator automatically, and is tied to a single file/URL.
What I'm curious about is how best to use the new CoreData/
persistence classes to emulate the typical database-driven approach,
where you have a single database file, but with multiple
documents providing interfaces to it.
As an example, say I'm writing a program to keep track of Personnel.
The primary entities would be Person and Group. I'd like to store all
these entities in a single SQLite database using CoreData. However,
rather than using a single "document" with a master/detail interface,
I'd rather have one window showing a list of all Persons or Groups,
and a separate editor "document" window to edit each Person or Group.
These editor windows should behave just like a normal
NSPersistentDocument with regard to how they interact with the store,
but I want all of the documents to read and write from the same
database file on disk. Changes made in one Person editor would not
affect those in any other editor until the document was saved (i.e.
the changes committed).
What's the best approach for handling a paradigm such as this?
Thanks,
- Paul M -
> On Apr 29, 2005, at 3:29 PM, John Timmer wrote:
>> Stick an order key in every author, and use that to sort an array
>> created out of the set. This seems better, but commits you to having
>> a unique author instance for every place an author appears.
>
> It sounds like you need this anyway since earlier in your post you said
> there are essentially no ways of identifying when two authors are
> identical.
>
> -- Chris
Well, as I said there, you can match author's names. You wouldn't match
Hanson, C with Hanson, Chris even if they were the same person, but you
could match all instances of Hanson, C. Even if they weren't the same
person, they'd be indistinguishable at this level.
I did say that I'm not sure where that would come down on the
performance/memory use equation, though. I haven't done a multi-way search
on a managed object context with > 10000 objects yet. Anyone know how quick
that is?
JT
_______________________________________________
This mind intentionally left blank -
On Apr 29, 2005, at 4:03 PM, Paul Mix wrote:
> As an example, say I'm writing a program to keep track of Personnel.
> The primary entities would be Person and Group. I'd like to store all
> these entities in a single SQLite database using CoreData. However,
> rather than using a single "document" with a master/detail interface,
> I'd rather have one window showing a list of all Persons or Groups,
> and a separate editor "document" window to edit each Person or Group.
> These editor windows should behave just like normal a normal
> NSPersistentDocument with regards to how it interacts with the store,
> but I want all of the documents to read and write from the same
> database file on disk. Changes made in one Person editor would not
> affect those in any other editor until the document was saved (i.e.
> the changes committed).
>
> What's the best approach for handling a paradigm such as this?
If I understand what you're asking, the question doesn't really affect
Core Data per se. You only need one document and perhaps only one
nib. All you really need is two windows, and maybe one is an inspector.
If this is a single data file application (à la iTunes), then this is
super-easy because you can just bind your array controllers to the
AppDelegate's Managed Object Context and all of them will have access
to the same data.
- Scott
--
http://treehouseideas.com/
http://theocacao.com/ [blog] -
On Apr 29, 2005, at 4:13 PM, John Timmer wrote:
> I did say that I'm not sure where that would come down on the
> performance/memory use equation, though. I haven't done a multi-way
> search on a managed object context with > 10000 objects yet. Anyone
> know how quick that is?
If you're using NSPredicate with an NSSQLiteStoreType store, it should
theoretically be quite fast.
- Scott
--
http://treehouseideas.com/
http://theocacao.com/ [blog] -
On Apr 29, 2005, at 4:03 PM, Paul Mix wrote:
> What I'm curious about is how best to use the new CoreData/
> persistence classes to emulate the typical database-driven
> approach, where you have a single database file, but with
> multiple documents providing interfaces to it.
>
*If I understand correctly what you're after*, in general, although
Core Data does handle multiple concurrent access (to address one of
the worries from a week or two ago, it does use optimistic
locking...) this is not what Core Data is intended for. If you want
a database application, use a database.
> As an example, say I'm writing a program to keep track of
> Personnel. The primary entities would be Person and Group. I'd like
> to store all these entities in a single SQLite database using
> CoreData. However, rather than using a single "document" with a
> master/detail interface, I'd rather have one window showing a list
> of all Persons or Groups, and a separate editor "document" window
> to edit each Person or Group. These editor windows should behave
> just like normal a normal NSPersistentDocument with regards to how
> it interacts with the store, but I want all of the documents to
> read and write from the same database file on disk. Changes made in
> one Person editor would not affect those in any other editor until
> the document was saved (i.e. the changes committed).
> What's the best approach for handling a paradigm such as this?
>
Xcode already provides a template -- "Core Data Application"-- to
start down this path. It provides a single-window application where
the persistent store coordinator is set up and managed by an
application delegate. You could extract the part of the
-managedObjectContext method that creates the coordinator and use it
to implement a -coordinator method (as illustrated below). You are
free then to add as many managed object contexts as you wish...
    NSManagedObjectContext *moc = [[NSManagedObjectContext alloc] init];
    [moc setPersistentStoreCoordinator:[[NSApp delegate] coordinator]];
mmalc
AppDelegate ivar:
    NSPersistentStoreCoordinator *coordinator;

- (NSPersistentStoreCoordinator *)coordinator {
    if (coordinator != nil) {
        return coordinator;
    }
    NSError *error = nil;
    NSURL *url;
    NSString *path = @"** path to your shared store **";
    // check path exists -- perhaps create it if it doesn't, else
    // report an error
    url = [NSURL fileURLWithPath:path];
    coordinator = [[NSPersistentStoreCoordinator alloc]
        initWithManagedObjectModel:[self managedObjectModel]];
    if (![coordinator addPersistentStoreWithType:MY_STORE_TYPE
                                   configuration:nil
                                             URL:url
                                         options:nil
                                           error:&error])
    {
        [[NSApplication sharedApplication] presentError:error];
    }
    return coordinator;
} -
On Apr 29, 2005, at 4:13 PM, John Timmer wrote:
> I did say that I'm not sure where that would come down on the
> performance/memory use equation, though. I haven't done a multi-way
> search on a managed object context with > 10000 objects yet. Anyone
> know how quick that is?
>
For XML and Binary, the search will be performed in memory and should
be very quick, depending on the details of the predicate.
However, you can very likely gain a significant performance boost by
using a SQL store in that the query will be optimized down to a SQL
select statement and evaluated within the SQLite engine, which is
quite nicely optimized, itself.
We regularly tested and optimized Core Data against data sets in the
hundreds of thousands and millions of entities range.
If you do find a performance issue, please file a bug.
b.bum -
At 4:47 PM -0700 4/29/05, mmalcolm crawford wrote:
> On Apr 29, 2005, at 4:03 PM, Paul Mix wrote:
>
>> What I'm curious about is how best to use the new
>> CoreData/persistence classes to emulate the typical database-driven
>> approach, where you have a single database file, but with with
>> multiple documents providing interfaces to it.
>>
> *If I understand correctly what you're after*, in general, although
> Core Data does handle multiple concurrent access (to address one of
> the worries from a week or two ago, it does use optimistic
> locking...) this is not what Core Data is intended for. If you want
> a database application, use a database.
Well, I was looking into developing a custom OR-bridge for sqlite
when I first heard about CoreData, and it simply seemed to be a waste
of effort to re-invent the wheel (though I understand that CoreData
isn't really an ORB as much as a generalized modeling & persistence
tool).
> Xcode already provides a template -- "Core Data Application"-- to
> start down this path. It provides a single-window application where
> the persistent store coordinator is set up and managed by an
> application delegate. You could extract the part of the
> -managedObjectContext method that creates the coordinator and use it
> to implement a -coordinator method (as illustrated below). You are
> free then to add as many managed object contexts as you wish...
Thanks for the feedback. Allow me to revise my idiom a bit further. A
better analogy for the app I was planning would possibly be a
"Library" type of app, with two types of stores: a single,
centralized "Library" of media (books, cd's, etc.) (stored within the
app bundle or an Application Support directory), and multiple
"Personnel" document files (one for each person, that could be saved
and moved anywhere by the end-user). All details specific to each
Person would be stored in the "Person" document file, but would have
references to entities defined in the centralized "Library" database.
Your example above is pretty much what I'd planned on doing for the
Library file. Each media item in the Library would have its own
document/editor (with different editors for different media types),
each with its own context into the same coordinator. These "item"
editors would only be accessible by an "admin" (say, the librarian).
The "Person" documents would be editable by users. They would be able
to edit their personal details (name, address, etc.). They could
browse or check out items (i.e. making references to the entities in
the "Library"), but not directly edit their characteristics.
This splitting of Library and Person docs into separate files would
allow me to use file permissions to control access (since neither
CoreData nor sqlite support privileges that I'm aware of outside
actual file permissions).
My plan was to use an NSPersistentDocument for the "Person" documents
(one person, one file/store, one "document" types), but was unsure
how to handle the editors for the "Library" items (many items, one
file/store, multiple "document" types).
Am I totally barking up the wrong tree here?
Thanks,
- Paul -
On Apr 30, 2005, at 7:38 AM, Paul Mix wrote:
>> Xcode already provides a template -- "Core Data Application"-- to
>> start down this path. It provides a single-window application
>> where the persistent store coordinator is set up and managed by an
>> application delegate. You could extract the part of the
>> -managedObjectContext method that creates the coordinator and use
>> it to implement a -coordinator method (as illustrated below). You
>> are free then to add as many managed object contexts as you wish...
>
> Thanks for the feedback. Allow me to revise my idiom a bit further.
> A better analogy for the app I was planning would possibly be a
> "Library" type of app, with two types of stores: a single,
> centralized "Library" of media (books, cd's, etc.) (stored within
> the app bundle or an Application Support directory), and multiple
> "Personnel" document files (one for each person, that could be
> saved and moved anywhere by the end-user). All details specific to
> each Person would be stored in the "Person" document file, but
> would have references to entities defined in the centralized
> "Library" database. [...]
> My plan was to use an NSPersistentDocument for the "Person"
> documents (one person, one file/store, one "document" types), but
> was unsure how to handle the editors for the "Library" items (many
> items, one file/store, multiple "document" types).
>
Again *if* I understand correctly what you want, then it's reasonably
straightforward.
To edit the library, you could create a "Core Data Application".
You need to change the path to the data store to point wherever is
appropriate (per - (NSString *)applicationSupportFolder), and then
you could just use the single application-wide editing context the
template creates for you and reference it from as many window
controllers as you wish.
[data store] -- [p.s.c.] -- [m.o.c.] -- [window controller] (books)
   (file)                         |---- [window controller] (CDs)
                                  |---- [window controller] (DVDs)
The downside to this (single context) approach that you probably
anticipated is that it gives you a single "editing context" or
scratch pad (to use the analogy here:
<http://developer.apple.com/documentation/Cocoa/Conceptual/CoreData/Articles/cdBasics.html>).
If you make edits to instances of two types of media in two separate
windows, then a 'save' commits changes in both.
The advantage to the architecture you suggested, and outlined in the
previous reply, is that you're able to keep changes discrete per
window, and save independently:
[data store] -- [p.s.c.] -- [m.o.c.] -- [window controller] (books)
   (file)            |----- [m.o.c.] -- [window controller] (CDs)
                     |----- [m.o.c.] -- [window controller] (DVDs)
A potential disadvantage here is that if there are any relationships
between books, CDs, and/or DVDs, then you may at some stage have a
situation where changes made in one window are inconsistent with
changes recently saved by another, and a user may unwittingly
overwrite those changes. Optimistic locking won't help you here
because you have a single persistence stack. This is a rather subtle
point...
For the Person documents, you can create a standard "Core Data
Document-based Application", but configure the persistence stack so
that in addition to the document's store you also use the library store:
[document file] -- [p.s.c.] -- [m.o.c.] -- [document controller]
        [library] ----|
There are two places in which you need to configure the stack. The
obvious one is the aptly-named:
- (BOOL)configurePersistentStoreCoordinatorForURL:(NSURL *)url
        ofType:(NSString *)fileType error:(NSError **)error
You can override the method to add a persistent store to the
coordinator:
- (BOOL)configurePersistentStoreCoordinatorForURL:(NSURL *)url
                                           ofType:(NSString *)fileType
                                            error:(NSError **)error
{
    if (![super configurePersistentStoreCoordinatorForURL:url
                                                   ofType:fileType
                                                    error:error])
    {
        return NO;
    }
    NSURL *libraryURL = // URL for your library (not the document's url)
    NSPersistentStoreCoordinator *coordinator =
        [[self managedObjectContext] persistentStoreCoordinator];
    // libraryStore is presumably an ivar of the document subclass
    libraryStore =
        [coordinator addPersistentStoreWithType:STORE_TYPE
                                  configuration:nil
                                            URL:libraryURL
                                        options:nil
                                          error:error]; // already an NSError **
    // implementation continues...
Note, however, that this method is invoked when an *existing*
document is first opened and when a *new* document is first saved.
Clearly, then, overriding this method isn't much help for new
documents... To address this issue, you also need to override:
- (id)initWithType:(NSString *)type error:(NSError **)error;
It should add the persistent store in the same way as configure...
So: (a) It may be worth factoring this code into a separate method;
(b) In your configure... method, or the factored method, you need to
ensure that the store hasn't already been added.
- (BOOL)addLibraryStore:(NSError **)error
{
    NSURL *url = // URL for your library
    NSPersistentStoreCoordinator *coordinator =
        [[self managedObjectContext] persistentStoreCoordinator];
    id store = [coordinator persistentStoreForURL:url];
    if (store == nil) {
        store =
            [coordinator addPersistentStoreWithType:STORE_TYPE
                                      configuration:nil
                                                URL:url
                                            options:nil
                                              error:error];
        if (store == nil) {
            return NO;
        }
    }
    return YES;
}
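Putting the pieces together, the two overrides might then reduce to
something like the sketch below. The class name PersonDocument, the
LibraryStoreURL() helper and the path inside it are placeholders made
up for the example, NSSQLiteStoreType stands in for STORE_TYPE, and the
code assumes ARC. Later SDKs deprecate this configure... method in
favour of a variant that also takes a model configuration and store
options, so treat this as an outline rather than tested code.

```objective-c
#import <Cocoa/Cocoa.h>

// Placeholder: where the shared library store lives (illustrative path).
static NSURL *LibraryStoreURL(void)
{
    NSString *path = [@"~/Library/Application Support/MyLibrary/Library.db"
        stringByExpandingTildeInPath];
    return [NSURL fileURLWithPath:path];
}

@interface PersonDocument : NSPersistentDocument
- (BOOL)addLibraryStore:(NSError **)error;
@end

@implementation PersonDocument

// Condensed form of the factored method above: adds the library store
// unless the coordinator already has it.
- (BOOL)addLibraryStore:(NSError **)error
{
    NSURL *url = LibraryStoreURL();
    NSPersistentStoreCoordinator *coordinator =
        [[self managedObjectContext] persistentStoreCoordinator];
    if ([coordinator persistentStoreForURL:url] != nil) {
        return YES;
    }
    return [coordinator addPersistentStoreWithType:NSSQLiteStoreType
                                     configuration:nil
                                               URL:url
                                           options:nil
                                             error:error] != nil;
}

// New, untitled documents: no configure... call happens until the
// first save, so add the library store here.
- (id)initWithType:(NSString *)type error:(NSError **)error
{
    self = [super initWithType:type error:error];
    if (self != nil && ![self addLibraryStore:error]) {
        return nil;
    }
    return self;
}

// Existing documents being opened, and new documents on first save.
- (BOOL)configurePersistentStoreCoordinatorForURL:(NSURL *)url
                                           ofType:(NSString *)fileType
                                            error:(NSError **)error
{
    if (![super configurePersistentStoreCoordinatorForURL:url
                                                   ofType:fileType
                                                    error:error]) {
        return NO;
    }
    return [self addLibraryStore:error];
}

@end
```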
Note that after all this, the biggest issue you face will, I suspect,
be creating cross-store relationships (more anon)...
mmalc
(All code typed in Mail, so usual caveats apply.)



