Converting *u unicode hex sequences

  • Hello list.

    My code receives NSStrings containing Unicode hex escapes that start
    with an asterisk (*) instead of a backslash (\), and I need to decode
    these strings.  For example, I need to convert @"hello*u0020world" to
    @"hello world".

    Using NSMutableString's
    replaceOccurrencesOfString:withString:options:range: to convert @"*"
    to @"\\" yields @"hello\u0020world" as expected, but I can't seem to
    get from there to @"hello world".

    I realize that \u0020 is neither percent-escaped nor UTF-8 encoded.
    I've tried searching this list's archives, and the web in general,
    and tried all the variants of
    stringByReplacingPercentEscapesUsingEncoding: and
    stringWithCString:encoding: I can think of. I scanned the CFString
    docs, but none of the functions jumped out at me.

    So, how does one convert from @"hello*u0020world" to @"hello world",
    short of parsing the hex sequences?

    Thanks.
  • You get to parse the hex sequences. Backslash escapes are only
    handled by the compiler. Once your code is compiled, backslashes are
    no longer treated specially in any way.

    You could write a regular expression that matches
    \*u([0-9A-Fa-f]{4}), then convert the captured ASCII hex digits to an
    int by calling sscanf with "%x". Finally, use
    replaceCharactersInRange:withString: to replace each *uABCD sequence
    with the character it encodes.
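
    The parsing approach described above can be sketched in plain C,
    without a regular expression library. This is only an illustration
    under some assumptions: the names decode_star_u and put_utf8 are
    hypothetical, the input is assumed to be a NUL-terminated UTF-8 or
    ASCII string, and the result is written as UTF-8 (which could then
    be handed to stringWithUTF8String:). Escapes for U+0000 and for
    surrogate code units are deliberately left untouched rather than
    decoded.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Write the UTF-8 encoding of a BMP code point (cp <= 0xFFFF) to out.
   Returns the number of bytes written (1 to 3). */
static size_t put_utf8(unsigned cp, char *out)
{
    if (cp < 0x80) {
        out[0] = (char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        return 2;
    }
    out[0] = (char)(0xE0 | (cp >> 12));
    out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (char)(0x80 | (cp & 0x3F));
    return 3;
}

/* Decode every "*uXXXX" (exactly four hex digits) in `in` into UTF-8.
   `out` must hold at least strlen(in) + 1 bytes: each 6-byte escape
   becomes at most 3 bytes, so the output never grows.
   Anything that is not a complete escape is copied through unchanged,
   as are escapes for U+0000 (it would end the C string early) and for
   surrogates U+D800..U+DFFF (not valid on their own in UTF-8). */
void decode_star_u(const char *in, char *out)
{
    while (*in) {
        unsigned cp;
        if (in[0] == '*' && in[1] == 'u' &&
            isxdigit((unsigned char)in[2]) && isxdigit((unsigned char)in[3]) &&
            isxdigit((unsigned char)in[4]) && isxdigit((unsigned char)in[5]) &&
            sscanf(in + 2, "%4x", &cp) == 1 &&   /* ASCII hex -> int */
            cp != 0 && !(cp >= 0xD800 && cp <= 0xDFFF)) {
            out += put_utf8(cp, out);
            in += 6;                             /* skip "*uXXXX" */
        } else {
            *out++ = *in++;                      /* ordinary byte */
        }
    }
    *out = '\0';
}
```

    Calling decode_star_u("hello*u0020world", buf) with a buffer of at
    least 17 bytes leaves "hello world" in buf. Matching the whole
    "*uXXXX" pattern, rather than first replacing every "*", keeps
    literal asterisks elsewhere in the string intact.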

    (In reply to the message from <ncdev05...> of Nov 4, 2007, 7:12 PM.)
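  • If you are running on Mac OS X 10.4 or later, you can use
    CFStringTransform once the asterisks have been replaced with
    backslashes:

    CFStringTransform((CFMutableStringRef)myMutableString, NULL, CFSTR("Hex-Any"), false);

    The second argument is a CFRange *; passing NULL transforms the whole
    string.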

    "Hex-Any" is equivalent to "Hex-Any/Java", and the \u style is the one
    used in Java.

    Deborah Goldsmith
    Internationalization, Unicode Liaison
    Apple Inc.
    <goldsmit...>
