FROM : John Stiles
DATE : Tue Jan 29 20:45:15 2008
BTW, please ignore the crazy slashes in the email I just sent.
Thunderbird thinks that slashes around a word mean "italic". I think
it's just confusing.
John Stiles wrote:
> Douglas Davidson wrote:
>>
>> On Jan 29, 2008, at 11:05 AM, Citizen wrote:
>>
>>> You could get close with generating the characters you expect to
>>> find at the word boundaries with:
>>>
>>> NSCharacterSet * wordBoundriesCharacterSet = [[NSCharacterSet
>>> letterCharacterSet] invertedSet];
>>>
>>> You would need to change this accordingly if you did not want
>>> numbers to be considered as a word boundary. You could of course
>>> create a boundary character set with just whitespace and punctuation
>>> marks - it just depends on how you would like the final feature to
>>> work.
>>
>> It's better not to reinvent the wheel for this sort of tokenization.
>> There is API for word-boundary analysis in AppKit
>> (doubleClickAtIndex: et al.), and, starting in Leopard, also in
>> CoreFoundation (CFStringTokenizer), that handles this in a consistent
>> and standards-appropriate fashion.
>>
>> The current find panel implementation works by searching for the
>> string in question using any appropriate NSString compare options,
>> then taking each result and determining whether it falls on word
>> boundaries. If a given occurrence doesn't have the right
>> word-boundary characteristics, the search continues.
>>
> Excellent, thank you for the information.
>
> Actually, I've done a few brief tests with the Find panel (in Leopard)
> and it appears to be fairly broken anyway :( Here's an example. Open
> up TextEdit and, in a new document, type:
>
> /this is a test.
>
> /Now search for "is a" with "Full word" selected. It won't work. Or
> search for "this " with a trailing space, or "test." with the trailing
> period. It also won't work. As far as I can tell, "Full word" only
> succeeds when you search for a /single word/ with no punctuation or
> spaces of any kind. This is a little more limited than what I was
> hoping for, so maybe I really do need to roll my own implementation.
>
> Oh well, off to Radar to file a bug on the Find panel, and I'll figure
> out some sort of solution. I can probably use NSCharacterSet or
> something and look at the characters on either side of the found text.
> I was hoping to avoid that, but it looks like I can't.
>
>
> _______________________________________________
>
> Cocoa-dev mailing list (<email_removed>)
>
> Please do not post admin requests or moderator comments to the list.
> Contact the moderators at cocoa-dev-admins(at)lists.apple.com
>
> Help/Unsubscribe/Update your Subscription:
> http://lists.apple.com/mailman/options/cocoa-dev/<email_removed>
>
> This email sent to <email_removed>
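
A minimal sketch of the approach discussed in the thread: search for the string, then look at the characters on either side of each hit and keep searching if the hit does not sit on word boundaries. It is plain C (which also compiles as Objective-C) and assumes ASCII text in which a word character is a letter or digit. The names is_word_char, is_boundary, and find_full_word are hypothetical, and this is not how the AppKit Find panel or CFStringTokenizer decide boundaries.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumption: a "word character" is an ASCII letter or digit. */
static bool is_word_char(char c) {
    return isalnum((unsigned char)c) != 0;
}

/* True if a boundary lies between text[pos - 1] and text[pos]:
   the start or end of the text, or at least one neighbor is not
   a word character. */
static bool is_boundary(const char *text, size_t len, size_t pos) {
    if (pos == 0 || pos == len) return true;
    return !is_word_char(text[pos - 1]) || !is_word_char(text[pos]);
}

/* Returns the offset of the first occurrence of needle that starts
   and ends on a boundary, or -1 if there is none. */
static long find_full_word(const char *text, const char *needle) {
    assert(text != NULL && needle != NULL);
    size_t len = strlen(text);
    size_t nlen = strlen(needle);
    if (nlen == 0) return -1;

    const char *p = text;
    while ((p = strstr(p, needle)) != NULL) {
        size_t start = (size_t)(p - text);
        if (is_boundary(text, len, start) &&
            is_boundary(text, len, start + nlen)) {
            return (long)start;
        }
        p++; /* wrong boundary characteristics: continue the search */
    }
    return -1;
}
```

Under these assumptions, searching "this is a test." for "is a", "this " (with the trailing space), or "test." (with the period) succeeds, while a search for "is" skips the hit inside "this" and matches the standalone word.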