How best to archive in CSV format

  • Hi

    I'm looking for advice on the best way to handle archiving my documents
    in csv (comma-separated values) format.

    I have written a small Cocoa application to be used for data capture.
    The program accepts text and numbers typed into an NSTableView and
    stores them in an array of objects. Following the document-based
    application examples in Aaron Hillegass's excellent book, my app is able
    to archive the array to disk in a coded format.  I have implemented
    encodeWithCoder: and initWithCoder: methods on the class that is stored
    in the array.  My MyDocument class has dataRepresentationOfType: and
    loadDataRepresentation:ofType: methods which use NSKeyedArchiver to save
    or retrieve documents.

    That all works very nicely (thanks Aaron) but in order to be useful I
    need to change the document storage format to simple csv.  Can I do
    this through the encodeWithCoder: / initWithCoder: mechanism?  This seems
    to me to be the logical place to write routines for transforming
    objects to and from a disk file, but maybe I'm misunderstanding the
    function of NSCoder.  Is it sensible to write a coder that converts
    objects to UTF-8 strings?

    What I have at the moment is an extra method inside my object that
    returns a string representation of the object by appending a
    description of each item to an NSMutableString.  Then I have replaced
    the NSKeyedArchiver part of dataRepresentationOfType: with a loop that
    retrieves this string from each item in the array and concatenates it
    into a bigger string which is then returned with
      return [string dataUsingEncoding: NSUTF8StringEncoding];
    The encodeWithCoder: method is not being used.

    This works, but it looks clumsy. I doubt that it will scale, as the
    string will grow very large.  The init method for NSMutableString
    requires that I specify a capacity, which it says is a "hint" of how
    much memory to allocate.  I don't know how binding (no, not the Cocoa
    sort of binding) a "hint" is.  Am I free to exceed the capacity, or do
    I have to anticipate the maximum string size?  These doubts make me
    think I'm going about this all wrong.  Should I be using the encode
    and decode system to convert to and from csv format?

    regards

    Denis

    Denis Stanton
    Orcon Internet Limited
    (09) 480 9299
    http://www.orcon.net.nz
  • On May 17, 2005, at 4:51 AM, Denis Stanton wrote:

    > Hi
    >
    > I'm looking for advice on the best way to handle archiving my
    > documents in csv (comma-separated values) format.
    >
    > These doubts make me think I'm going about this all wrong.  Should
    > I be using the encode and decode system to convert to and from csv
    > format?

    Look at the reference documentation for NSString and NSArray, since
    they have some methods for working with delimited data.  You will find
    a couple of methods for building and parsing strings from arrays of
    strings, with the delimiter set to whatever you want:
    -[NSString componentsSeparatedByString:] and
    -[NSArray componentsJoinedByString:].  You will also find read and
    write methods for working with files.

    Brian
  • On May 16, 2005, at 9:37 PM, Brian Smith wrote:

    >
    > On May 17, 2005, at 4:51 AM, Denis Stanton wrote:
    >
    >
    >> I'm looking for advice on the best way to handle archiving my
    >> documents in csv (comma-separated values) format.
    >>
    >> These doubts make me think I'm going about this all wrong.
    >> Should I be using the encode and decode system to convert to and
    >> from csv format?
    >>
    >
    > Look at the references document to NSString and NSArray since they
    > have some methods for working with delimited data.  You will find a
    > couple of methods for working with making and parsing strings from
    > arrays of strings and setting the delimiter to whatever you want,
    > (NSString)componentsSeparatedByString: and (NSArray)
    > componentsJoinedByString:.  Also, you will find read and write
    > methods for creating files.

    That sounds like a really bad idea because you need to deal with
    quoting and such.  You'll have to code this up by hand unless you can
    find existing C or Objective-C code to do it.  NSScanner might
    help.

    One alternative is to look at PyObjC, because Python ships with a csv
    module, though it's not optimal since current versions have a
    limitation such that it only knows how to deal with bytestrings, not
    unicode, so you have to encode everything into utf-8 before putting
    it into the csv and decode it from utf-8 after getting it out.

    In other words, it's time to decide whether you REALLY need CSV :)

    -bob
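
Editor's sketch of the quoting problem described above, using Python's csv module. The sample record and variable names are made up for illustration. Note that in current Python 3 the csv module works on text strings directly, so the UTF-8 round trip mentioned above applies only to the Python 2 versions of the time.

```python
import csv
import io

# Hypothetical record: the second field contains a comma and embedded quotes.
line = 'Acme Ltd,"Widgets, large ""deluxe""",42'

# Naive splitting cuts the quoted field in two and leaves the quote marks in.
naive_fields = line.split(",")

# The csv reader applies the quoting rules and returns three clean fields.
parsed_fields = next(csv.reader(io.StringIO(line)))

print(naive_fields)
print(parsed_fields)
```

The naive split yields four pieces instead of three, which is the kind of breakage that hand-rolled splitting on "," has to work around.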
  • On May 17, 2005, at 1:43 PM, Bob Ippolito wrote:

    >
    > On May 16, 2005, at 9:37 PM, Brian Smith wrote:
    >
    >>
    >> On May 17, 2005, at 4:51 AM, Denis Stanton wrote:
    >>
    >>
    >>> I'm looking for advice on the best way to handle archiving my
    >>> documents in csv (comma-separated values) format.
    >>>
    >>> These doubts make me think I'm going about this all wrong.  Should
    >>> I be using the encode and decode system to convert to and from csv
    >>> format?
    >>>
    >>
    >> Look at the references document to NSString and NSArray since they
    >> have some methods for working with delimited data.  You will find a
    >> couple of methods for working with making and parsing strings from
    >> arrays of strings and setting the delimiter to whatever you want,
    >> (NSString)componentsSeparatedByString: and
    >> (NSArray)componentsJoinedByString:.  Also, you will find read and
    >> write methods for creating files.
    >
    > That sounds like a really bad idea because you need to deal with
    > quoting and such.  You'll have to code this up by hand unless you can
    > find existing C or Objective-C code to do it.  NSScanner might
    > help.
    >
    > One alternative is to look at PyObjC, because Python ships with a csv
    > module.. though it's not optimal since current versions have a
    > limitation such that it only knows how to deal with bytestrings, not
    > unicode, so you have to encode everything into utf-8 before putting it
    > into the csv and decode it from utf-8 after getting it out.
    >
    > In other words, it's time to decide whether you REALLY need CSV :)
    >
    > -bob

    Thanks Brian and Bob

    I have a reasonable idea of how to do the string handling parts with
    componentsSeparatedByString and componentsJoinedByString.

    That's not the part that worries me.  My question is whether I should
    be writing this CSV conversion stuff inside the standard methods
    encodeWithCoder: and initWithCoder:, and if so how.  It seems the Cocoa
    architecture has a well-thought-out mechanism for archiving and I should
    try to work within it.  My problem is that the example I have produces a
    binary coded disk file and I need csv text.  I want to make the Cocoa
    archive mechanism work with csv.

    I know that there are traps in this, as one of my data columns could
    contain commas, so I need to worry about quotes, but for the present
    task I do have to conform to csv because I am going to propose this
    Cocoa application as a replacement for an existing web-browser-based
    data entry program, and it's important to show that it can simply
    replace the older program without requiring anybody else to change.

    Denis
  • On May 17, 2005, at 9:43 AM, Bob Ippolito wrote:

    > One alternative is to look at PyObjC, because Python ships with a
    > csv module.. though it's not optimal since current versions have a
    > limitation such that it only knows how to deal with bytestrings,
    > not unicode, so you have to encode everything into utf-8 before
    > putting it into the csv and decode it from utf-8 after getting it out.

    I've used Python to read a csv file, and it couldn't handle Mac line
    endings either, which the files I need to read have.  So, with PyObjC,
    I used NSString's componentsSeparatedByString: method to read the
    file.  I have found this to be useful, but obviously you have to
    experiment with the csv files you have.  I did have to strip some
    quote marks from the ends of the strings in the array, but I was still
    able to do it easily with NSString and NSArray methods.

    Brian
  • On May 16, 2005, at 10:12 PM, Brian Smith wrote:

    >
    > On May 17, 2005, at 9:43 AM, Bob Ippolito wrote:
    >
    >
    >> One alternative is to look at PyObjC, because Python ships with a
    >> csv module.. though it's not optimal since current versions have a
    >> limitation such that it only knows how to deal with bytestrings,
    >> not unicode, so you have to encode everything into utf-8 before
    >> putting it into the csv and decode it from utf-8 after getting it
    >> out.
    >
    > I've used python to read a csv file and it can't handle mac line
    > endings too, which the files I need to read have.  So, with PyObjC,
    > I used NSString's componentsSeparatedByString: method to read the
    > file, so I have found this to be useful, but obviously you have to
    > experiment given on csv files you have.  I did have to strip some
    > quote marks from the ends of the array of strings, but I was able
    > to still do it easily with NSString and NSArray methods.

    Actually it can read Mac line endings (bare '\r') just fine if you
    open the file with universal newlines (the 'U' mode).  I do this all
    the time.

    -bob
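
Editor's sketch of reading bare '\r' line endings with the csv module. The 'U' mode mentioned above belongs to the Python of the time and is no longer accepted by recent Python 3 releases; the sketch below instead assumes the Python 3 convention of passing newline='' so that the csv reader sees the line endings untranslated. The sample data is made up.

```python
import csv
import io

# Hypothetical classic Mac OS data: each record ends in a bare carriage return.
mac_text = 'name,amount\r"Smith, J",10\rJones,20\r'

# newline="" leaves line endings untranslated so the csv reader can
# recognise '\r', '\n' or '\r\n' itself. With a real file the equivalent
# would be open(path, newline="").
rows = list(csv.reader(io.StringIO(mac_text, newline="")))

for row in rows:
    print(row)
```

The quoted "Smith, J" field stays in one piece, and no manual stripping of quote marks is needed.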
  • On 17 maj 2005, at 04.08, Denis Stanton wrote:

    > That's not the part that worries me.  My question is whether I should
    > be writing this CSV conversion stuff inside the standard methods
    > encodeWithCoder and initWithCoder, and if so how.  It seems the
    > Cocoa architecture has a well-thought-out mechanism for archiving
    > and I should try and work within it.  My problem is the example I
    > have produces a binary coded disk file and I need csv text.  I want
    > to make the Cocoa archive mechanism work with csv.

    Why not do it in the dataRepresentationOfType: and
    loadDataRepresentation:ofType: methods of your document subclass?
    NSKeyedArchiver provides a way to store / restore an object graph,
    and it provides its own storage format. Since neither of those seems
    to be something that's really useful for you, I'd suggest that you
    avoid using it for this particular purpose.

    j o a r
  • Hello...

    No, you probably wouldn't want to implement this in terms of an
    NSCoder subclass or pattern. It might be possible, but it would make
    things significantly more complicated than they need to be. NSCoder
    is designed from a perspective of allowing each object to determine
    the best way to encode itself, and allows an object to store its
    data in an arbitrary manner. If you attempted to implement CSV using
    NSCoder, your goal would be the exact opposite: you would need to
    make each object conform to a particular way of being encoded.

    A more applicable pattern would be to add your own methods to the
    relevant NS-base classes using categories and create your own
    separate archiving and unarchiving process, similar to the methods
    NSArray and NSDictionary objects have to read and write themselves in
    XML.

    Writing code to work for a variety of abstract cases is more
    difficult than writing code to handle a specific case (and it's not
    trivial to do it right). To start with, the best option is probably
    just to implement the code within your document subclass, based on
    the particular model you are using.

    Essentially, in your document class implementation of
    dataRepresentationOfType: and loadDataRepresentation:ofType:, instead
    of using a flavor of NSArchiver, you would use your own code to read
    and write CSV. Since you know the model you are using to feed the
    tableview, you only need to worry about the objects that your model
    uses or allows and implement the code to read and write your model as
    the CSV format you've been provided requires.

    You'll still need to work out all the "interesting" details of
    reading and writing CSV, but you don't need to try to make it fit
    within the existing NSCoder archiving mechanism.

    Hope that helps,

    Louis

    >
    > That's not the part that worries me.  My question is whether I should
    > be writing this CSV conversion stuff inside the standard methods
    > encodeWithCoder and initWithCoder, and if so how.  It seems the
    > Cocoa architecture has a well-thought-out mechanism for archiving and
    > I should try and work within it.  My problem is the example I have
    > produces a binary coded disk file and I need csv text.  I want to
    > make the Cocoa archive mechanism work with csv.
    >
    > I know that there are traps in this as one of my data columns could
    > contain commas, so I need to worry about quotes, but for the present
    > task I do have to conform to csv because I am going to propose this
    > Cocoa application as a replacement for an existing web-browser based
    > data entry program and it's important to show that it can simply
    > replace the older program without requiring anybody else to change.
    >
    > Denis
    >
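
Editor's sketch of the approach Louis describes: keep the CSV code at the document level and convert the whole model to and from bytes there. This is a Python analogy rather than Cocoa code; the Entry class, its two fields, and both function names are hypothetical stand-ins for the poster's model and for the dataRepresentationOfType: / loadDataRepresentation:ofType: pair.

```python
import csv
import io
from dataclasses import dataclass
from typing import List


@dataclass
class Entry:
    # Hypothetical model object standing in for one table row.
    description: str
    amount: str


def entries_to_csv_data(entries: List[Entry]) -> bytes:
    """Model to bytes: the role played by dataRepresentationOfType:."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    for entry in entries:
        writer.writerow([entry.description, entry.amount])
    return buffer.getvalue().encode("utf-8")


def entries_from_csv_data(data: bytes) -> List[Entry]:
    """Bytes to model: the role played by loadDataRepresentation:ofType:."""
    text = data.decode("utf-8")
    reader = csv.reader(io.StringIO(text, newline=""))
    # Skip empty rows so a trailing blank line does not raise IndexError.
    return [Entry(description=row[0], amount=row[1]) for row in reader if row]


entries = [Entry("Paper, A4", "12.50"), Entry('Pens "blue"', "3")]
data = entries_to_csv_data(entries)
restored = entries_from_csv_data(data)
print(data)
print(restored == entries)
```

The model objects know nothing about CSV here; only the two document-level functions do, which is the separation the message argues for.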
  • Hi Louis

    Thanks a lot.  This really answers my question.  I have already gone
    down this path, modifying dataRepresentationOfType: and
    loadDataRepresentation:ofType:.

    I was thinking that I could do something nicer using NSCoder
    subclasses, but the problem is that NSArchiver wants to put out a lot
    of additional information about the object graph, and all I want in the
    file is the simple text contents of the end nodes of that graph.  I'm
    not going to be writing a generalised csv export tool, as this is
    really just a proof-of-concept program at this stage.  The concept
    being "that task that the accounts department hates would be so much
    easier if you gave them this little program - and a Mac to run it on".

    Thanks for a really detailed answer, right on target.

    Denis

    On May 17, 2005, at 6:49 PM, Louis C. Sacha wrote:

    > Hello...
    >
    > No, you probably wouldn't want to implement this in terms of an
    > NSCoder subclass or pattern. It might be possible, but it would make
    > things significantly more complicated than they need to be. NSCoder is
    > designed from a perspective of allowing each object to determine the
    > best way to encode itself, and allows an object to store its data in
    > an arbitrary manner. If you attempted to implement CSV using NSCoder,
    > your goal would be the exact opposite: you would need to make each
    > object conform to a particular way of being encoded.
    >
    >
    > A more applicable pattern would be to add your own methods to the
    > relevant NS-base classes using categories and create your own separate
    > archiving and unarchiving process, similar to the methods NSArray and
    > NSDictionary objects have to read and write themselves in XML.
    >
    > Writing code to work for a variety of abstract cases is more difficult
    > than writing code to handle a specific case (and it's not trivial to
    > do it right). To start with, the best option is probably just to
    > implement the code within your document subclass, based on the
    > particular model you are using.
    >
    >
    > Essentially, in your document class implementation of
    > dataRepresentationOfType: and loadDataRepresentation:ofType:, instead
    > of using a flavor of NSArchiver, you would use your own code to read
    > and write CSV. Since you know the model you are using to feed the
    > tableview, you only need to worry about the objects that your model
    > uses or allows and implement the code to read and write your model as
    > the CSV format you've been provided requires.
    >
    > You'll still need to work out all the "interesting" details of reading
    > and writing CSV, but you don't need to try to make it fit within the
    > existing NSCoder archiving mechanism.
    >
    > Hope that helps,
    >
    > Louis
    >
    >
    >>
    >> That's not the part that worries me.  My question is whether I should be
    >> writing this CSV conversion stuff inside the standard methods
    >> encodeWithCoder and initWithCoder, and if so how.  It seems the
    >> Cocoa architecture has a well-thought-out mechanism for archiving and
    >> I should try and work within it.  My problem is the example I have
    >> produces a binary coded disk file and I need csv text.  I want to
    >> make the Cocoa archive mechanism work with csv.
    >>
    >> I know that there are traps in this as one of my data columns could
    >> contain commas, so I need to worry about quotes, but for the present
    >> task I do have to conform to csv because I am going to propose this
    >> Cocoa application as a replacement for an existing web-browser based
    >> data entry program and it's important to show that it can simply
    >> replace the older program without requiring anybody else to change.
    >>
    >> Denis
    >>
    >>
  • Hi  j o a r

    Thank you, that answers my question precisely, even though the answer
    is "don't do that".  It matches what I have done.
    - Do modify dataRepresentationOfType: and loadDataRepresentation:ofType:
    - Don't mess with NSKeyedArchiver and coder.  They have their own
      agenda with handling an object graph.  Even if I could write csv
      versions of the class instances, NSKeyedArchiver would want to include
      the information that they were in an array, so it would add information
      to the output file.

    Thanks for your help

    Denis

    On May 17, 2005, at 6:00 PM, j o a r wrote:

    >
    > On 17 maj 2005, at 04.08, Denis Stanton wrote:
    >
    >> That's not the part that worries me.  My question is whether I should be
    >> writing this CSV conversion stuff inside the standard methods
    >> encodeWithCoder and initWithCoder, and if so how.  It seems the
    >> Cocoa architecture has a well-thought-out mechanism for archiving and
    >> I should try and work within it.  My problem is the example I have
    >> produces a binary coded disk file and I need csv text.  I want to
    >> make the Cocoa archive mechanism work with csv.
    >
    > Why not do it in the dataRepresentationOfType: and
    > loadDataRepresentation:ofType: methods of your document subclass?
    > NSKeyedArchiver provides a way to store / restore an object graph, and
    > it provides its own storage format. Since neither of those seems to be
    > something that's really useful for you, I'd suggest that you avoid
    > using it for this particular purpose.
    >
    > j o a r
    >
    >
    >
  • On 17-mei-2005, at 4:36, Bob Ippolito wrote:

    >
    > On May 16, 2005, at 10:12 PM, Brian Smith wrote:
    >
    >
    >>
    >> On May 17, 2005, at 9:43 AM, Bob Ippolito wrote:
    >>
    >>
    >>
    >>> One alternative is to look at PyObjC, because Python ships with a
    >>> csv module.. though it's not optimal since current versions have
    >>> a limitation such that it only knows how to deal with
    >>> bytestrings, not unicode, so you have to encode everything into
    >>> utf-8 before putting it into the csv and decode it from utf-8
    >>> after getting it out.
    >>>
    >>
    >> I've used python to read a csv file and it can't handle mac line
    >> endings too, which the files I need to read have.  So, with
    >> PyObjC, I used NSString's componentsSeparatedByString: method to
    >> read the file, so I have found this to be useful, but obviously
    >> you have to experiment given on csv files you have.  I did have to
    >> strip some quote marks from the ends of the array of strings, but
    >> I was able to still do it easily with NSString and NSArray methods.

    Fields containing quotes and commas require additional work. Using
    the csv module requires less thought.

    >>
    >
    > Actually it can read Mac line endings (bare '\r') just fine if you
    > open the file with universal newlines (the 'U' mode).  I do this
    > all the time.

    Technically that is not correct; you need to use lineterminator='\r'
    in the dialect.

    Ronald
  • On May 18, 2005, at 4:49 PM, Ronald Oussoren wrote:

    >
    > On 17-mei-2005, at 4:36, Bob Ippolito wrote:
    >
    >
    >>
    >> On May 16, 2005, at 10:12 PM, Brian Smith wrote:
    >>
    >>
    >>
    >>>
    >>> On May 17, 2005, at 9:43 AM, Bob Ippolito wrote:
    >>>
    >>>
    >>>
    >>>
    >>>> One alternative is to look at PyObjC, because Python ships with
    >>>> a csv module.. though it's not optimal since current versions
    >>>> have a limitation such that it only knows how to deal with
    >>>> bytestrings, not unicode, so you have to encode everything into
    >>>> utf-8 before putting it into the csv and decode it from utf-8
    >>>> after getting it out.
    >>>>
    >>>>
    >>>
    >>> I've used python to read a csv file and it can't handle mac line
    >>> endings too, which the files I need to read have.  So, with
    >>> PyObjC, I used NSString's componentsSeparatedByString: method to
    >>> read the file, so I have found this to be useful, but obviously
    >>> you have to experiment given on csv files you have.  I did have
    >>> to strip some quote marks from the ends of the array of strings,
    >>> but I was able to still do it easily with NSString and NSArray
    >>> methods.
    >>>
    > Fields containing quotes and commas require additional work. Using
    > the csv module requires less thought.

    Yeah, stripping quotes is not enough, because you'll end up with
    columns that are broken in the middle due to a comma being present..
    of course, you can hack around this until it works, but I'd highly
    recommend using a mostly correct implementation of a CSV parser in
    the first place :)

    >> Actually it can read Mac line endings (bare '\r') just fine if you
    >> open the file with universal newlines (the 'U' mode).  I do this
    >> all the time.
    >>
    >
    > Technically that is not correct, you need to use
    > lineterminator='\r' in the dialect.

    It's perfectly correct for *reading* csv files.  Universal newlines
    will simply convert '\r' to '\n' before the csv reader sees it,
    allowing you to read csv files of any line ending (even mixed)
    without issues.

    For writing csv files, you might care about the line terminator.. but
    for reading them, it's VERY convenient to use universal newlines
    especially when you're dealing with both the Windows and Macintosh
    version of Excel, interchangeably, for example.

    -bob
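
Editor's sketch of the writing side, where the line terminator does matter. It assumes Python 3's csv module and uses its lineterminator option to produce bare '\r' record endings; the sample rows are made up.

```python
import csv
import io

# Hypothetical rows; the second one contains a comma and so needs quoting.
rows = [["name", "amount"], ["Smith, J", "10"]]

# newline="" stops the text layer from translating the terminator we ask for.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer, lineterminator="\r")
writer.writerows(rows)
mac_csv = buffer.getvalue()

# Reading back needs no terminator setting: the reader accepts '\r' and '\n'.
round_trip = list(csv.reader(io.StringIO(mac_csv, newline="")))

print(repr(mac_csv))
print(round_trip == rows)
```

This mirrors the distinction drawn in the message: the terminator is a choice when writing, while reading can stay agnostic about it.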
  • On May 18, 2005, at 5:02 PM, Bob Ippolito wrote:
    > For writing csv files, you might care about the line terminator.. but
    > for reading them, it's VERY convenient to use universal newlines
    > especially when you're dealing with both the Windows and Macintosh
    > version of Excel, interchangeably, for example.

    Speaking of Excel, Microsoft does not use standard CSV:
    <http://ostermiller.org/utils/ExcelCSV.html>.  This might or might not
    matter to you -- just thought I'd point it out.

    --Andy