FROM : Douglas Davidson
DATE : Tue Jan 29 20:20:16 2008
On Jan 29, 2008, at 11:05 AM, Citizen wrote:
> You could get close with generating the characters you expect to
> find at the word boundaries with:
>
> NSCharacterSet * wordBoundriesCharacterSet = [[NSCharacterSet
> letterCharacterSet] invertedSet];
>
> You would need to change this accordingly if you did not want
> numbers to be considered as a word boundary. You could of course
> create a boundary character set with just whitespace and punctuation
> marks - it just depends on how you would like the final feature to
> work.

It's better not to reinvent the wheel for this sort of tokenization.
There is API for word-boundary analysis in AppKit (doubleClickAtIndex:
et al.) and, starting in Leopard, also in CoreFoundation
(CFStringTokenizer), which handles this in a consistent and
standards-appropriate fashion.
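
As a rough illustration of the CoreFoundation route, here is a minimal sketch that asks CFStringTokenizer whether a candidate range starts and ends on word-token edges. The helper name range_is_whole_words is hypothetical, and the sketch assumes Mac OS X 10.5 or later and that word-unit tokens skip whitespace and punctuation; it is not taken from any Apple implementation.

```c
#include <CoreFoundation/CoreFoundation.h>
#include <stdbool.h>

/* Hypothetical helper: true if `range` begins at the start of a word token
   and ends at the end of a word token, as reported by CFStringTokenizer. */
static bool range_is_whole_words(CFStringRef text, CFRange range) {
    if (range.length <= 0) return false;

    CFRange all = CFRangeMake(0, CFStringGetLength(text));
    CFLocaleRef locale = CFLocaleCopyCurrent();
    CFStringTokenizerRef tok = CFStringTokenizerCreate(
        kCFAllocatorDefault, text, all, kCFStringTokenizerUnitWord, locale);

    bool starts = false, ends = false;
    CFIndex end = range.location + range.length;

    /* Walk the word tokens and compare their edges with the range edges. */
    while (CFStringTokenizerAdvanceToNextToken(tok) != kCFStringTokenizerTokenNone) {
        CFRange t = CFStringTokenizerGetCurrentTokenRange(tok);
        if (t.location == range.location) starts = true;
        if (t.location + t.length == end) ends = true;
        if (t.location >= end) break;  /* tokens past the range cannot matter */
    }

    CFRelease(tok);
    CFRelease(locale);
    return starts && ends;
}
```

A range that begins or ends on punctuation would be rejected by this particular check, which may or may not be the behavior you want.
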

The current find panel implementation works by searching for the
string in question using any appropriate NSString compare options,
then taking each result and determining whether it falls on word
boundaries. If a given occurrence doesn't have the right
word-boundary characteristics, the search continues.
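
The search-then-verify loop described above can be sketched in portable C. This is only an illustration of the control flow, not the find panel's actual code: find_whole_word, boundary_fn and ascii_boundary are hypothetical names, the search is a plain case-sensitive strstr rather than NSString compare options, and the ASCII boundary test is a stand-in for real word-boundary analysis (doubleClickAtIndex: or CFStringTokenizer in a Cocoa application).

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical predicate type: true if a word boundary lies just before
   index i (0 <= i <= len) in text. */
typedef bool (*boundary_fn)(const char *text, size_t len, size_t i);

/* ASCII-only stand-in for real word-boundary analysis. */
static bool is_word_char(unsigned char c) {
    return isalnum(c) || c == '_';
}

/* A boundary is a transition between a word character and anything else. */
static bool ascii_boundary(const char *text, size_t len, size_t i) {
    bool before = i > 0 && is_word_char((unsigned char)text[i - 1]);
    bool after = i < len && is_word_char((unsigned char)text[i]);
    return before != after;
}

/* Returns the index of the first occurrence of needle at or after `from`
   that has a boundary on both sides, or -1 if there is none. */
static long find_whole_word(const char *text, const char *needle,
                            size_t from, boundary_fn at_boundary) {
    size_t len = strlen(text), n = strlen(needle);
    if (n == 0 || from > len) return -1;

    const char *p = text + from;
    while ((p = strstr(p, needle)) != NULL) {
        size_t pos = (size_t)(p - text);
        if (at_boundary(text, len, pos) && at_boundary(text, len, pos + n))
            return (long)pos;
        p++;  /* wrong boundary characteristics: the search continues */
    }
    return -1;
}
```

Passing the boundary test in as a function pointer keeps the loop independent of how boundaries are decided, so the ASCII stand-in could be swapped for a tokenizer-backed check without touching the search itself.
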
Douglas Davidson






Cocoa mail archive

