FROM : Aki Inoue
DATE : Wed Jan 15 01:39:06 2003
Renaud,
I think we're talking along the same lines.
\\300 is octal for 0x00C0, which is "A grave".
It is usually called the precomposed form.
And "A \U0300" is the decomposed form.
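To make the two forms concrete, here is a minimal C sketch. It is not the TEC routine from this thread: `toy_decompose` and its three-entry table are hypothetical, `utf16_unit` is only a stand-in for UniChar, and a real converter covers all of Unicode and also reorders combining marks.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint16_t utf16_unit; /* stand-in for UniChar (assumption) */

/* Toy table: precomposed character -> base letter + combining mark. */
struct toy_mapping { utf16_unit precomposed, base, mark; };

static const struct toy_mapping toy_table[] = {
    { 0x00C0, 0x0041, 0x0300 }, /* A WITH GRAVE  -> A + COMBINING GRAVE ACCENT */
    { 0x00C1, 0x0041, 0x0301 }, /* A WITH ACUTE  -> A + COMBINING ACUTE ACCENT */
    { 0x0104, 0x0041, 0x0328 }, /* A WITH OGONEK -> A + COMBINING OGONEK */
};

/* Writes the decomposed form of in[0..n) to out and returns the number of
 * units written. Returns 0 if out (capacity cap) is too small or n is 0.
 * Characters that are not in the toy table are copied unchanged. */
size_t toy_decompose(const utf16_unit *in, size_t n,
                     utf16_unit *out, size_t cap)
{
    size_t written = 0;
    for (size_t i = 0; i < n; i++) {
        const struct toy_mapping *hit = NULL;
        for (size_t j = 0; j < sizeof toy_table / sizeof toy_table[0]; j++) {
            if (toy_table[j].precomposed == in[i]) {
                hit = &toy_table[j];
                break;
            }
        }
        size_t need = hit ? 2 : 1;
        if (written + need > cap)
            return 0; /* output buffer too small */
        if (hit) {
            out[written++] = hit->base;
            out[written++] = hit->mark;
        } else {
            out[written++] = in[i];
        }
    }
    return written;
}
```

Passing the single unit 0x00C0 (the precomposed form) yields the two units 0x0041 0x0300 (the decomposed form).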
> So I used getCharacters but something isn't working still. I think I
> may have asked part of my question backwards. Boy, Unicode is not too
> simple! Perhaps with an example.
What exactly is not working?
Aki
On 2003.1.14, at 04:11 PM, Renaud Boisjoly wrote:
> Hi again
>
> So I used getCharacters but something isn't working still. I think I
> may have asked part of my question backwards. Boy, Unicode is not too
> simple! Perhaps with an example.
>
> Say the string I need to convert is "A acute". It first looks like:
> \\300
>
> But what I need is:
> A\\u0300
>
> I'm not sure yet what each is supposed to be called.
>
> I get the feeling that the routine you so kindly put together actually
> does the opposite... is this possible? If so, I tried inverting some
> of the parameters in CreateTextConverter, but it fails to convert
> anything... any clues?
>
> Thanks again to all for helping out!
>
> Renaud
>
> On Tuesday, January 14, 2003, at 05:44 PM, Aki Inoue wrote:
>
>> Renaud,
>>
>> You can use getCharacters: to bulk-get characters from NSString.
>>
>> There is one thing to note if you're using a stack buffer in a loop,
>> as in your original example.
>>
>> Depending on which decomposed format you need, you have to be a
>> little bit more careful at the end of each buffer run.
>>
>> For example, let's assume your source NSString contains the following
>> character sequence "U00C0 U0328" LATIN CAPITAL LETTER A WITH GRAVE
>> and COMBINING OGONEK, "Ą̀" (this should display correctly in
>> Mail.app).
>> When decomposed, it can be either "U0041 U0300 U0328" or "U0041
>> U0328 U0300". They are both perfectly legal Unicode character
>> sequences, but only the latter is in canonically decomposed format,
>> because the ogonek has a lower combining class (202) than the grave
>> accent (230) and so must come first.
>> Back to the NSString with this character sequence, you won't get
>> the canonical format if your working buffer ends between U00C0 and
>> U0328, since TEC cannot know the next character in that case.
>>
>> So, if you want to have canonically decomposed format (not just
>> decomposed), you need to make sure your working buffer ends BEFORE a
>> base character (![[NSCharacterSet nonBaseCharacterSet]
>> characterIsMember:theChar]). You don't have to worry about
>> surrogates since pre-Jaguar TEC doesn't recognize them.
>>
>> Aki
>>
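The buffer-boundary rule in the quoted message above can be sketched in plain C. This is only an illustration: `safe_chunk_length` and `is_combining_mark` are hypothetical names, `utf16_unit` stands in for UniChar, and the mark test covers just the Combining Diacritical Marks block (U+0300 to U+036F), whereas real code would use the nonBaseCharacterSet check mentioned in the message.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint16_t utf16_unit; /* stand-in for UniChar (assumption) */

/* Simplified test: only U+0300..U+036F count as combining marks here. */
static int is_combining_mark(utf16_unit c)
{
    return c >= 0x0300 && c <= 0x036F;
}

/* Returns how many units, starting at text[start], to put into a working
 * buffer of max_chunk units so that the NEXT chunk begins with a base
 * character. Falls back to a hard split when one base character carries
 * more marks than the buffer can hold.
 * Requires start < length and max_chunk > 0. */
size_t safe_chunk_length(const utf16_unit *text, size_t length,
                         size_t start, size_t max_chunk)
{
    size_t remaining = length - start;
    if (remaining <= max_chunk)
        return remaining;           /* the rest fits in one buffer */

    size_t end = start + max_chunk; /* first unit NOT in this chunk */
    while (end > start && is_combining_mark(text[end]))
        end--;                      /* back up to the base character */

    return (end > start) ? end - start : max_chunk;
}
```

A conversion loop would call this before each getCharacters:range: fetch and advance `start` by the returned length, so a base character and its combining marks always reach the converter in the same run.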
>> On 2003.1.14, at 01:08 PM, Renaud Boisjoly wrote:
>>
>>> Hi again
>>>
>>> Ok, I think it will work, but I do have a last newbie question to
>>> ask if I can...
>>>
>>> I've managed to convert from the UniChar result to an NSString, but
>>> I'm not clear on how to efficiently do the reverse. My original
>>> string is in an NSString and I guess I need to convert it to
>>> UniChar... but being pretty inexperienced, this looks like a mystery
>>> to me. Do I need to iterate through each character using
>>> characterAtIndex and add them to characters[] one by one? Should I
>>> use an NSScanner? Is there an immensely obvious way to do this and
>>> I'm just not seeing it (probably)? I know it's probably something I
>>> should know, but considering I've only been programming for a year
>>> or so except for stuff like AppleScript, I miss a lot of things.
>>>
>>> My current idea is a for loop using characterAtIndex to add each
>>> character...
>>>
>>> Thanks for your time if you can afford it.
>>>
>>> Renaud
>>>
>>> On Tuesday, January 14, 2003, at 02:39 PM, Aki Inoue wrote:
>>>
>>>> #import <Foundation/Foundation.h>
>>>>
>>>> // LATIN CAPITAL LETTER A WITH GRAVE (precomposed form)
>>>> static UniChar characters[] = {0x00C0};
>>>>
>>>> #define MAX_BUFFER_LENGTH (100)
>>>>
>>>> int main (int argc, const char * argv[]) {
>>>>     NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];
>>>>     UnicodeToTextInfo textInfo;
>>>>     // Source: default Unicode. Target: canonically decomposed Unicode.
>>>>     UnicodeMapping mapping = {
>>>>         CreateTextEncoding(kTextEncodingUnicodeDefault,
>>>>             kTextEncodingDefaultVariant, kUnicode16BitFormat),
>>>>         CreateTextEncoding(kTextEncodingUnicodeDefault,
>>>>             kUnicodeCanonicalDecompVariant, kUnicode16BitFormat),
>>>>         kUnicodeUseLatestMapping};
>>>>     UniChar buffer[MAX_BUFFER_LENGTH];
>>>>     ByteCount inputRead, outputLen;
>>>>     OSStatus status;
>>>>
>>>>     status = CreateUnicodeToTextInfo(&mapping, &textInfo);
>>>>     if (noErr != status) {
>>>>         NSLog(@"Failed to create UnicodeToTextInfo");
>>>>         exit(1);
>>>>     }
>>>>
>>>>     // Lengths are in bytes, not in UniChar units.
>>>>     status = ConvertFromUnicodeToText(textInfo, sizeof(characters),
>>>>         characters, kTECKeepInfoFixMask, 0, NULL, NULL, NULL,
>>>>         MAX_BUFFER_LENGTH * sizeof(UniChar), &inputRead, &outputLen,
>>>>         buffer);
>>>>     if (noErr != status) {
>>>>         NSLog(@"Failed to convert string");
>>>>         exit(1);
>>>>     }
>>>>
>>>>     // buffer now holds outputLen bytes of decomposed text.
>>>>     DisposeUnicodeToTextInfo(&textInfo);
>>>>
>>>>     [pool release];
>>>>     return 0;
>>>> }
>> _______________________________________________
>> cocoa-dev mailing list | <email_removed>
>> Help/Unsubscribe/Archives:
>> http://www.lists.apple.com/mailman/listinfo/cocoa-dev
>> Do not post admin requests to the list. They will be ignored.