Re: endian problems with UTF16 on Intel Macs
FROM : Chris Suter
DATE : Tue Aug 29 13:56:55 2006

On 29/08/2006, at 9:42 PM, Ricky Sharp wrote:

>
> On Tuesday, August 29, 2006, at 00:59AM, Chris Suter 
> <<email_removed>> wrote:
>

>>
>> On 29/08/2006, at 3:47 PM, Donald Hall wrote:
>>

>>> Furthermore, I understood that "external representation" was always
>>> big endian.

>>
>> No. The representation is as dictated by the encoding. Some encodings
>> don't have an endian aspect to them (UTF-8 for example). I'm guessing
>> if you pick kCFStringEncodingUTF16, OS X is free to choose big-endian
>> or little-endian.

>
> Not quite.  According to <http://www.unicode.org/faq/utf_bom.html#36>,
> unmarked UTF-16 and UTF-32 use big-endian by default.  I would expect
> the Cocoa frameworks to honor that default.


But Cocoa can write the byte order mark, and as it turns out, I'm right.

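#include <CoreFoundation/CoreFoundation.h>
#include <stdio.h>

int main (void)
{
  /* Ask for plain UTF-16 without specifying a byte order. */
  CFDataRef data = CFStringCreateExternalRepresentation (NULL, CFSTR ("test"),
                                                         kCFStringEncodingUTF16, 0);

  if (data == NULL)
    return 1;

  /* Dump the raw bytes, including any byte order mark. */
  const UInt8 *bytes = CFDataGetBytePtr (data);
  CFIndex i;

  for (i = 0; i < CFDataGetLength (data); ++i)
    printf ("%02x ", bytes[i]);

  putchar ('\n');

  CFRelease (data);  /* we own the data returned by a Create function */

  return 0;
}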

produces:

ff fe 74 00 65 00 73 00 74 00

on an Intel machine.
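
The leading ff fe is the byte order mark U+FEFF written low byte first, so the data is little-endian and marked as such. For anyone reading such data back without CoreFoundation, here is a minimal sketch in portable C of checking for the mark and reading a code unit in the right order. The names utf16_order, utf16_detect_bom and utf16_read_unit are made up for illustration and are not part of any Apple API. Treating unmarked data as big-endian is an assumption taken from the Unicode FAQ entry quoted above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper types and functions, for illustration only. */
enum utf16_order { UTF16_NO_BOM, UTF16_BIG_ENDIAN, UTF16_LITTLE_ENDIAN };

/* Look at the first two bytes for the byte order mark U+FEFF. */
enum utf16_order utf16_detect_bom (const uint8_t *bytes, size_t len)
{
  if (len >= 2)
    {
      if (bytes[0] == 0xFE && bytes[1] == 0xFF)
        return UTF16_BIG_ENDIAN;
      if (bytes[0] == 0xFF && bytes[1] == 0xFE)
        return UTF16_LITTLE_ENDIAN;
    }
  return UTF16_NO_BOM;
}

/* Combine the two bytes at `bytes` into one 16-bit code unit.
   Anything other than little-endian is read as big-endian, which
   assumes unmarked data follows the Unicode default. */
uint16_t utf16_read_unit (const uint8_t *bytes, enum utf16_order order)
{
  if (order == UTF16_LITTLE_ENDIAN)
    return (uint16_t) (bytes[0] | (bytes[1] << 8));
  return (uint16_t) ((bytes[0] << 8) | bytes[1]);
}
```

Given the ten bytes shown above, utf16_detect_bom reports little-endian, and utf16_read_unit called on the bytes after the mark yields 0x0074 ('t'). If a fixed byte order is needed when writing, CoreFoundation also defines kCFStringEncodingUTF16BE and kCFStringEncodingUTF16LE, which could be passed in place of kCFStringEncodingUTF16.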

- Chris
