Re: characterAtIndex: method and composite characters
FROM : Douglas Davidson
DATE : Wed Apr 04 18:42:48 2007

On Apr 4, 2007, at 8:05 AM, Ewan Delanoy wrote:

> - When an NSString or NSAttributedString (let's call it s) appears
>   on-screen as, say, "(a with tilde)(other characters ...)", is it
>   guaranteed that [s characterAtIndex: 0] will be "a with tilde", and
>   not "a" (with "tilde" as a second character)?
>
> - If this is not the case, I need a more accurate version of
>   "characterAtIndex:". Is this already built in?


No, that is not guaranteed; and yes, better alternatives are built 
in.  The characterAtIndex: method should be avoided wherever 
possible; with Unicode strings, examining a single character usually 
is not sufficient.  Instead, use methods like compare:options:range:, 
rangeOfString:options:range:, and 
rangeOfCharacterFromSet:options:range:, which will give you the 
Unicode-conformant operations you are looking for, with a wide 
variety of options.
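
To make the difference concrete, here is a minimal sketch (not code from 
the original message) that builds the same text in decomposed and 
precomposed form and compares the two.  The helper names 
DecomposedSample, PrecomposedSample, EquivalentStrings and 
LiterallyEqual are hypothetical, chosen only for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: "a with tilde" + "bc" in decomposed form, i.e.
// "a" (U+0061) followed by COMBINING TILDE (U+0303).
static NSString *DecomposedSample(void) {
    const unichar chars[] = { 0x0061, 0x0303, 'b', 'c' };
    return [NSString stringWithCharacters:chars length:4];
}

// Hypothetical helper: the same text using the single precomposed
// character LATIN SMALL LETTER A WITH TILDE (U+00E3).
static NSString *PrecomposedSample(void) {
    const unichar chars[] = { 0x00E3, 'b', 'c' };
    return [NSString stringWithCharacters:chars length:3];
}

// Hypothetical helper: compares without NSLiteralSearch, so composed
// character sequences that are canonically equivalent are treated as
// matching.
static BOOL EquivalentStrings(NSString *a, NSString *b) {
    NSRange whole = NSMakeRange(0, [a length]);
    return [a compare:b options:0 range:whole] == NSOrderedSame;
}

// Hypothetical helper: the same comparison with NSLiteralSearch, which
// requires exact unit-by-unit equality.
static BOOL LiterallyEqual(NSString *a, NSString *b) {
    NSRange whole = NSMakeRange(0, [a length]);
    return [a compare:b options:NSLiteralSearch range:whole] == NSOrderedSame;
}
```

With these helpers, [DecomposedSample() characterAtIndex:0] is the bare 
"a" (U+0061) even though both samples draw identically, while 
EquivalentStrings(DecomposedSample(), PrecomposedSample()) should report 
a match and LiterallyEqual should not.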

If you need to extract substrings, be sure to use 
rangeOfComposedCharacterSequenceAtIndex: to make sure that you are 
not dividing a composed character sequence.  If you wish to replace 
substrings in a mutable string, try 
replaceOccurrencesOfString:withString:options:range:.
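
As a rough illustration of the two suggestions above, the following 
sketch extracts and counts user-visible characters without splitting a 
composed character sequence, and wraps the mutable-string replacement 
call.  FirstComposedCharacter, ComposedCharacterCount and ReplaceAll are 
hypothetical helper names, not part of Foundation.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the first user-visible character of s as
// a substring, keeping a base character together with its combining marks.
static NSString *FirstComposedCharacter(NSString *s) {
    if ([s length] == 0) {
        return @"";
    }
    NSRange r = [s rangeOfComposedCharacterSequenceAtIndex:0];
    return [s substringWithRange:r];
}

// Hypothetical helper: counts composed character sequences by stepping
// from the end of one sequence to the start of the next.
static NSUInteger ComposedCharacterCount(NSString *s) {
    NSUInteger count = 0;
    NSUInteger i = 0;
    NSUInteger n = [s length];
    while (i < n) {
        NSRange r = [s rangeOfComposedCharacterSequenceAtIndex:i];
        i = NSMaxRange(r);
        count++;
    }
    return count;
}

// Hypothetical helper: replaces every occurrence of target in s and
// returns the number of replacements made.  Passing options:0 leaves
// NSLiteralSearch off.
static NSUInteger ReplaceAll(NSMutableString *s, NSString *target,
                             NSString *replacement) {
    return [s replaceOccurrencesOfString:target
                              withString:replacement
                                 options:0
                                   range:NSMakeRange(0, [s length])];
}
```

For a string made of "a" (U+0061), COMBINING TILDE (U+0303), "b" and 
"c", FirstComposedCharacter returns a two-unit substring (the "a" 
together with its tilde), and ComposedCharacterCount returns 3 although 
the string's length is 4.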

NSString does have methods to precompose or decompose an entire 
string, but these methods are really useful only in special 
circumstances--for example, when you are dealing with existing code 
that for some reason requires one form or the other.  Bear in mind 
that most combinations of base characters and combining marks do not 
have precomposed forms.  In general, you are better off using the 
methods mentioned above for Unicode-conformant comparisons.
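
For the special cases where one form really is required, here is a small 
sketch of the whole-string conversions, assuming the 
precomposedStringWithCanonicalMapping and 
decomposedStringWithCanonicalMapping methods are the ones meant.  The 
helper names BaseWithCombiningTilde, PrecomposedLength and 
DecomposedLength are hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: a base character followed by COMBINING TILDE
// (U+0303), i.e. a two-unit decomposed sequence.
static NSString *BaseWithCombiningTilde(unichar base) {
    const unichar chars[] = { base, 0x0303 };
    return [NSString stringWithCharacters:chars length:2];
}

// Hypothetical helper: length in UTF-16 units after canonical
// precomposition of the whole string.
static NSUInteger PrecomposedLength(NSString *s) {
    return [[s precomposedStringWithCanonicalMapping] length];
}

// Hypothetical helper: length in UTF-16 units after canonical
// decomposition of the whole string.
static NSUInteger DecomposedLength(NSString *s) {
    return [[s decomposedStringWithCanonicalMapping] length];
}
```

PrecomposedLength(BaseWithCombiningTilde('a')) is 1, because "a with 
tilde" has the precomposed form U+00E3.  
PrecomposedLength(BaseWithCombiningTilde('q')) stays 2, because Unicode 
defines no precomposed "q with tilde"; this is the kind of combination 
that precomposition cannot collapse.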

Douglas Davidson