Best way to parse XML data of non-ASCII encoding...?
FROM : Simon Liu
DATE : Tue Apr 05 19:12:16 2005

Hi,

I am doing some XML parsing for Mac OS X 10.2+, thus I am using Core
Foundation's XML functions, such as CFXMLTreeCreateFromData().

Things are working fine except for XML files with non-ASCII
characters.  The functions seem to ignore the encoding attribute of
the XML declaration, such as in:

<?xml version="1.0" encoding="shift_jis" standalone="yes"?>

Given an XML file in the above encoding, with Japanese characters as
values between tags, the routines crash.

However, if I first convert the file to UTF-8, things work fine...

NSString *s = [NSString stringWithContentsOfURL:sourceURL];
NSData *xmlData = [s dataUsingEncoding:NSUTF8StringEncoding];
// use as CFDataRef in CFXMLTreeCreateFromData()
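
The snippet above leaves NSString to guess the source encoding.  A
possibly more explicit variant would read the encoding name from the
declaration first and convert with that.  Below is a minimal sketch of
that first step in plain C.  xml_declared_encoding() is a hypothetical
helper of my own, not a Core Foundation function, and it assumes the
declaration is written in ASCII-compatible bytes (true for Shift_JIS
and UTF-8, not for UTF-16).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of Core Foundation): copies the value of
   the encoding pseudo-attribute of the XML declaration into out.
   Returns 1 on success, 0 if there is no declaration, no encoding
   attribute, or out is too small. */
int xml_declared_encoding(const char *buf, size_t len,
                          char *out, size_t outlen)
{
    static const char key[] = "encoding";
    const size_t keylen = sizeof key - 1;
    size_t end, i, n;
    char quote;

    /* The declaration must be the very first thing in the file. */
    if (len < 5 || memcmp(buf, "<?xml", 5) != 0)
        return 0;

    /* Find the closing "?>" so we never scan into the document body. */
    for (end = 5; end + 1 < len; end++)
        if (buf[end] == '?' && buf[end + 1] == '>')
            break;
    if (end + 1 >= len)
        return 0;

    for (i = 5; i + keylen < end; i++) {
        if (memcmp(buf + i, key, keylen) != 0)
            continue;
        i += keylen;
        /* Skip optional whitespace around '='. */
        while (i < end && strchr(" \t\r\n", buf[i])) i++;
        if (i >= end || buf[i] != '=') return 0;
        i++;
        while (i < end && strchr(" \t\r\n", buf[i])) i++;
        if (i >= end || (buf[i] != '"' && buf[i] != '\'')) return 0;
        quote = buf[i++];
        /* Measure the quoted value. */
        for (n = 0; i + n < end && buf[i + n] != quote; n++)
            ;
        if (i + n >= end || n + 1 > outlen) return 0;
        memcpy(out, buf + i, n);
        out[n] = '\0';
        return 1;
    }
    return 0;
}
```

The resulting name could then, I assume, be passed to
CFStringConvertIANACharSetNameToEncoding() to get a CFStringEncoding
for decoding the bytes before re-encoding them as UTF-8.  Note that
the declaration inside the converted data would still name the old
encoding.
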

Is this the expected behaviour?  Is there a more elegant way to parse
non-ASCII XML files?

Regards,
Simon

Related mails (subject | author | date):
Best way to parse XML data of non-ASCII encoding...? | Simon Liu | Apr 5, 19:12
Re: Best way to parse XML data of non-ASCII encoding...? | Simon Liu | Apr 5, 19:46
Re: Best way to parse XML data of non-ASCII encoding...? | Kevin Viggers | Apr 7, 07:08
Re: Best way to parse XML data of non-ASCII encoding...? | Simon Liu | Apr 7, 12:53