parsing string data

  • Hi all,
    I have an NSString representing a block of text, in the following format:

    -----
    MESSAGETYPE
    header1:value
    header2:value

    TEXT
    ----

    where there is a variable number of headers, with no guarantee that they
    occur in the same order. I want to parse this data into an NSDictionary
    object, with the headers as keys and their values as values, and also
    store the "TEXT" in the dictionary. I have experience in Perl/Python,
    where I would have set up regular expressions to catch the headers.

    However, I'm at a loss on how to do the same in Objective-C and Cocoa.
    Any help is appreciated. Thanks.

    - Sandeep
  • On Dec 20, 2007, at 8:30 AM, C Sandeep wrote:

    > I have experience in Perl/Python, where I would have set up regular
    > expressions to catch the headers.
    > However, I'm at a loss on how to do the same in Objective-C and Cocoa.
    > Any help is appreciated.

    Unfortunately, there is no comprehensive regex facility in Cocoa. That
    said, you should be able to accomplish what you want by using
    NSScanner. Another option is to include one of the many third-party
    Cocoa / C regex libraries that exist (see: Google). Which solution to
    choose depends on your needs regarding performance, licensing
    restrictions, Unicode support, etc.

    j o a r
  • (I think I mailed directly to the replier, sorry for bugging.)

    Thanks. I think I will take the NSScanner route. From the Apple doc
    examples, it wasn't clear how to achieve my objective. Can you give me a
    hint? Thanks.

    - Sandeep
  • On Dec 20, 2007, at 8:47 AM, C Sandeep wrote:

    > Thanks. I think I will take the NSScanner route. From the Apple doc
    > examples, it wasn't clear how to achieve my objective. Can you give
    > me a hint? Thanks.

    NSScanner works by searching linearly through a string for sequences
    of characters, optionally extracting values of different types as it
    goes along. You have to use your knowledge of the format of the data
    you're parsing to use NSScanner correctly. You don't configure a
    scanner object and then have it return all the data you're looking for
    in one go. Instead, you use the scanner to search through the string
    step by step, "manually", for each individual part of the data set
    that you're looking for.

    In your example, it might help to split the string into lines first, to
    more easily parse out the message type and the headers.
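
    The approach described above can be sketched roughly as follows. This
    is only a minimal illustration, not a definitive implementation: the
    function name ParseMessage and the dictionary keys @"MESSAGETYPE" and
    @"TEXT" are made up for the example, and it assumes that the block is
    delimited by lines of dashes, that the headers end at the first blank
    line, and that each header is split at its first colon.

    ```objc
    #import <Foundation/Foundation.h>

    // Hypothetical helper (not from the thread): parses one message block.
    // The keys @"MESSAGETYPE" and @"TEXT" are illustrative assumptions; a
    // header with the same name would overwrite or be overwritten by them.
    static NSDictionary *ParseMessage(NSString *message)
    {
        NSMutableDictionary *result = [NSMutableDictionary dictionary];
        NSCharacterSet *ws = [NSCharacterSet whitespaceAndNewlineCharacterSet];
        NSArray *lines = [message componentsSeparatedByString:@"\n"];
        NSUInteger count = [lines count];
        NSUInteger i = 0;

        // Skip blank lines and the opening dash delimiter.
        while (i < count) {
            NSString *line = [[lines objectAtIndex:i] stringByTrimmingCharactersInSet:ws];
            if ([line length] > 0 && ![line hasPrefix:@"----"]) {
                break;
            }
            i++;
        }
        if (i >= count) {
            return result; // nothing to parse
        }

        // The first remaining line is taken to be the message type.
        NSString *type = [[lines objectAtIndex:i] stringByTrimmingCharactersInSet:ws];
        [result setObject:type forKey:@"MESSAGETYPE"];
        i++;

        // Headers: "key:value" lines, in any order, up to the first blank line.
        while (i < count) {
            NSString *line = [[lines objectAtIndex:i] stringByTrimmingCharactersInSet:ws];
            i++;
            if ([line length] == 0) {
                break; // a blank line ends the headers
            }
            NSScanner *scanner = [NSScanner scannerWithString:line];
            NSString *key = nil;
            NSString *value = nil;
            // Scan up to the first colon, then step over it.
            if ([scanner scanUpToString:@":" intoString:&key] &&
                [scanner scanString:@":" intoString:NULL]) {
                // The line contains no newline, so this reads the rest of it.
                if (![scanner scanUpToString:@"\n" intoString:&value]) {
                    value = @""; // header with an empty value
                }
                key = [key stringByTrimmingCharactersInSet:ws];
                [result setObject:value forKey:key];
            }
            // Lines without a colon are silently ignored in this sketch.
        }

        // Body: everything up to the closing dash delimiter.
        NSMutableArray *body = [NSMutableArray array];
        while (i < count) {
            NSString *line = [lines objectAtIndex:i];
            if ([[line stringByTrimmingCharactersInSet:ws] hasPrefix:@"----"]) {
                break;
            }
            [body addObject:line];
            i++;
        }
        NSString *text = [[body componentsJoinedByString:@"\n"]
                             stringByTrimmingCharactersInSet:ws];
        [result setObject:text forKey:@"TEXT"];
        return result;
    }
    ```

    Calling ParseMessage() on the sample block from the original question
    should give a dictionary with entries for header1, header2,
    MESSAGETYPE and TEXT. Splitting on the first colon only means that
    values which themselves contain colons are kept intact.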

    Some of the sample code that ships with the dev tools uses NSScanner,
    so you might find something of interest there:

    $ mdfind -onlyin /Developer/Examples -interpret NSScanner

    There is also a bit of documentation here:

    <http://developer.apple.com/documentation/Cocoa/Conceptual/Strings/Articles/Scanners.html>

    Another good place to search for sample code is Google Code Search:

    <http://www.google.com/codesearch>

    j o a r