NSXML encoding question...

  • Hi - I want to encode the following information in an XML document:

    <code>
    addr = *PC + ( *(PC + 1) << 8);
    </code>

    If I try the above syntax and parse it using NSXMLParser I get
    'parseErrorOccurred - Error NSXMLParserErrorDomain 23, (null)'

    It seems to be the less-than symbols which are causing the trouble.
    I've had a look at the XML specification and as far as I can tell I
    should be able to encode the less-than characters like so:

    <code>
    addr = *PC + ( *(PC + 1) &lt&lt 8);
    </code>

    But this too causes parse errors.

    Can anyone tell me how I can encode these characters or point me in
    the direction of some useful documentation ? Thanks.

    Martin Linklater
  • Use "&lt;" not "&lt".

  • On Aug 26, 2007, at 8:05 AM, Martin Linklater wrote:

    > Hi - I want to encode the following information in an XML document:
    >
    > <code>
    > addr = *PC + ( *(PC + 1) << 8);
    > </code>
    >
    > If I try the above syntax and parse it using NSXMLParser I get
    > 'parseErrorOccurred - Error NSXMLParserErrorDomain 23, (null)'
    >
    > It seems to be the less-than symbols which are causing the trouble.
    > I've had a look at the XML specification and as far as I can tell I
    > should be able to encode the less-than characters like so:
    >
    > <code>
    > addr = *PC + ( *(PC + 1) &lt&lt 8);
    > </code>

    You are almost there

    <code>
    addr = *PC + ( *(PC + 1) &lt;&lt; 8);
    </code>

    You were missing the ";" at the end of each entity.
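    A quick way to check the difference is to feed both spellings to a
    conforming XML parser. The sketch below uses Python's standard
    xml.etree.ElementTree as a stand-in for NSXMLParser (an assumption
    made only so the example is self-contained); the helper name
    try_parse is hypothetical.

```python
import xml.etree.ElementTree as ET

# The spelling from the question (no ';'), the corrected spelling,
# and the equivalent numeric character reference.
broken = "<code>addr = *PC + ( *(PC + 1) &lt&lt 8);</code>"
fixed = "<code>addr = *PC + ( *(PC + 1) &lt;&lt; 8);</code>"
numeric = "<code>addr = *PC + ( *(PC + 1) &#60;&#60; 8);</code>"


def try_parse(document):
    """Return the element text, or None if the document is not well-formed."""
    try:
        return ET.fromstring(document).text
    except ET.ParseError:
        return None


broken_result = try_parse(broken)    # entity reference is never terminated
fixed_result = try_parse(fixed)      # parser turns each &lt; back into '<'
numeric_result = try_parse(numeric)  # &#60; is the same character
```

    Only the unterminated form should be rejected; the other two should
    come back as the original C expression.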

    Jim

  • Martin Linklater <mailto:<mslinklater...> wrote (Sunday,
    August 26, 2007 5:05 AM +0100):
    > <code>
    > addr = *PC + ( *(PC + 1) &lt&lt 8);
    > </code>

    As others have pointed out, it's '&lt;'. XML entity references have the
    general form '&' <entity description> ';', where the description
    can be the name of a named entity or a numeric character code (e.g. '&#60;').

    However, if you want to encode literal data that may, or may
    not, contain reserved XML characters it's easier to use a CDATA block:

        <code>
            <![CDATA[
                addr = *PC + ( *(PC + 1) << 8); // & other <code>crazy</code> stuff!
            ]]>
        </code>

    Everything between '<![CDATA[' and ']]>' is read as a literal
    string of characters and is not interpreted as XML.
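    To illustrate, here is a minimal sketch using Python's standard
    xml.etree.ElementTree (again only a stand-in for NSXMLParser, which
    is an assumption of this example). The inner <code>...</code> inside
    the CDATA section should come back as plain text, not as a child
    element.

```python
import xml.etree.ElementTree as ET

document = (
    "<code><![CDATA[addr = *PC + ( *(PC + 1) << 8); "
    "// & other <code>crazy</code> stuff!]]></code>"
)

element = ET.fromstring(document)

# The CDATA payload is delivered as ordinary character data.
cdata_text = element.text

# No child elements are created for the "<code>" inside the CDATA section.
child_count = len(element)
```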

    --
    James Bucanek
  • You could try something like this (not 100% sure it will work):

    <code>
    <![CDATA[addr = *PC + ( *(PC + 1) << 8);]]>
    </code>

    And then use parser:foundCDATA: to get what you want.
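    The callback style can be sketched outside Cocoa as well. The
    example below uses Python's xml.parsers.expat, whose CDATA start/end
    handlers play roughly the role of parser:foundCDATA: (that mapping
    is an analogy, not the Cocoa API; the function name extract_cdata is
    hypothetical). Note that the Cocoa delegate method hands the section
    over as NSData, so it would still need converting to a string there.

```python
import xml.parsers.expat


def extract_cdata(document):
    """Return the text of every CDATA section in document, in order."""
    sections = []
    current = None  # list of text chunks while inside CDATA, else None

    def start_cdata():
        nonlocal current
        current = []

    def end_cdata():
        nonlocal current
        sections.append("".join(current))
        current = None

    def characters(data):
        # Character data may arrive in several chunks; keep only the
        # chunks reported between the CDATA start and end events.
        if current is not None:
            current.append(data)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartCdataSectionHandler = start_cdata
    parser.EndCdataSectionHandler = end_cdata
    parser.CharacterDataHandler = characters
    parser.Parse(document, True)
    return sections


found = extract_cdata(
    "<code><![CDATA[addr = *PC + ( *(PC + 1) << 8);]]></code>"
)
```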

    Paulo F. Andrade <52439...>
    mailto: <pfca...>

  • A CDATA block is probably the correct way to go here.
    I would still caution that one could construct valid C, ObjC or C++ code that contained "]]>" in it, so you still need to be a bit careful...

    if( [a multiplyWith:[b value]]>25 )
    {
        NSLog(@"whoops! CDATA trouble");
    }

    Unfortunately I don't have a lot of CDATA experience so I don't know the best workaround for this sort of situation.

    Personally I've always just written a quick function which replaces < > & in a string with the appropriate XML entities. In fact Cocoa may already have such a function handy; not sure. (My XML parsing code has always needed to be cross platform.)
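    As a rough illustration of the entity-replacement approach, the
    sketch below uses escape() from Python's standard xml.sax.saxutils
    rather than a hand-written function or any Cocoa call (the choice of
    Python is an assumption of this example). The ampersand has to be
    replaced first, otherwise the '&' in '&lt;' would itself be escaped
    again; escape() takes care of that ordering.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

source = "if (a < b && c > d) addr = *PC + ( *(PC + 1) << 8);"

# Replace &, < and > with &amp;, &lt; and &gt;.
escaped = escape(source)

# Embedding the escaped text in an element and parsing it again
# should give back the original string.
round_trip = ET.fromstring("<code>" + escaped + "</code>").text
```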


  • John Stiles <mailto:<jstiles...> wrote (Sunday, August
    26, 2007 12:10 PM -0700):
    > Unfortunately I don't have a lot of CDATA experience so I don't know
    > the best workaround for this sort of situation.

    The recommended solution is to encode the string using two CDATA
    sections, splitting the sequence "]]>" so it doesn't occur literally
    in the document:

        <![CDATA[if( [a multiplyWith:[b value]]]]><![CDATA[>25 )]]>

    <http://en.wikipedia.org/wiki/CDATA#Uses_of_CDATA_sections>
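    The split can be automated with a single string replacement. This is
    a minimal sketch in Python (the helper name to_cdata is
    hypothetical, and ElementTree is used only to check the round trip):
    every "]]>" in the input ends the current section after "]]" and
    opens a new one that starts with ">".

```python
import xml.etree.ElementTree as ET


def to_cdata(text):
    """Wrap text in CDATA, splitting any "]]>" across two sections."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


source = "if( [a multiplyWith:[b value]]>25 )"
wrapped = to_cdata(source)

# A parser should join the adjacent sections back into one string.
round_trip = ET.fromstring("<code>" + wrapped + "</code>").text
```

    For the example from this thread, wrapped comes out identical to the
    two-section form shown above.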

    --
    James Bucanek
  • On 26 Aug 2007, at 21:28, James Bucanek wrote:

    >
    > The recommended solution is to encode the string using two CDATA
    > tags, splitting the sequence "]]>" so it doesn't occur literally in
    > the document:
    >
    > <![CDATA[if( [a multiplyWith:[b value]]]]><![CDATA[>25 )]]>
    >
    > <http://en.wikipedia.org/wiki/CDATA#Uses_of_CDATA_sections>

    Thanks for all your suggestions. I'll use CDATA blocks.

    Martin Linklater