FROM : Gary L. Wade
DATE : Thu May 08 20:07:33 2008
Sorry, I misread your suggested method, but, as Adam points out, it still isn't adequate for someone who has free-styled 8-bit text with no idea what the original encoding was.
>
>On Wednesday, May 07, 2008, at 12:37PM, "Jean-Daniel Dupas"
><<email_removed>> wrote:
>>What makes you think this function assumes an exact encoding? This
>>method is not the same as +[NSString
>>stringWithContentsOfFile:encoding:error:].
>>
>>The method +stringWithContentsOfFile:usedEncoding:error: returns the
>>sniffed encoding by reference using the second argument. At least
>>that's what the documentation says: “This method attempts to
>>determine the encoding of the file at path.”
>>This method was introduced in Tiger, which may be why you have never
>>seen it before.
>
>Unfortunately, on Tiger that method only works for UTF-16 or UTF-32 files with
>a BOM, which makes it less useful than it might be. On Leopard it reads
>xattrs, then tries UTF-8 if it's not UTF-16/32, but it certainly doesn't
>sniff encodings the way TEC does. I was never motivated enough to figure out
>TEC, so I basically ended up checking for a BOM, trying UTF-8, and then using
>MacRoman if all else failed.
>
>--
>adam
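
The fallback chain Adam describes (check for a BOM, try UTF-8, then fall back to MacRoman) can be sketched in a few lines. This is only an illustration in Python, not the Cocoa code from the thread: the function name `decode_with_fallback` and the BOM table are assumptions made for the example. It also shows the limitation raised at the top of this message: the MacRoman step accepts any byte sequence, so arbitrary 8-bit text in some other encoding decodes without error but may come out with the wrong characters.

```python
import codecs

# BOMs are checked longest-first so that a UTF-32 LE BOM (FF FE 00 00)
# is not mistaken for a UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def decode_with_fallback(data: bytes):
    """Return (text, codec_name) using BOM, then UTF-8, then MacRoman.

    Hypothetical helper for illustration; the result is a guess, not a
    detected encoding.
    """
    # Step 1: a BOM identifies the encoding unambiguously.
    for bom, name in _BOMS:
        if data.startswith(bom):
            # These codecs consume the BOM and use it to pick byte order.
            return data.decode(name), name

    # Step 2: strict UTF-8 rejects most text in legacy 8-bit encodings.
    try:
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        pass

    # Step 3: last resort. MacRoman maps every byte value to a character,
    # so this never fails, even when the guess is wrong.
    return data.decode("mac_roman"), "mac_roman"
```

For example, `decode_with_fallback(b"caf\x8e")` falls through to the last step because a lone 0x8E byte is not valid UTF-8. If the bytes had actually been written in another 8-bit encoding, the call would still succeed, just with the wrong character for that byte.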





