Re: Indexing text, pdf, .doc
FROM : Joseph Heck
DATE : Sat Nov 02 01:30:18 2002

It would require more setup and programming effort, but you might also
consider Lucene (Java, from Apache). I'm not immediately aware of
anything that scans MS Word files and indexes their contents, but I
suspect such an addition has been made for Lucene, and its indexing
and full-text search mechanisms are quite advanced.

-joe
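
For readers unfamiliar with what a full-text index does, here is a minimal sketch in Java of an inverted index (term to document names). It is not Lucene's API and not what any tool named in this thread actually does internally; the class name TinyIndex and its methods are hypothetical, and it assumes PDF and Word files have already been converted to plain text by an external tool.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration class; not part of Lucene or any other library.
class TinyIndex {
    // Maps each lowercased term to the names of the documents containing it.
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Index already-extracted plain text under the given document name.
    void add(String docName, String text) {
        for (String term : tokenize(text)) {
            if (term.isEmpty()) continue;
            postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docName);
        }
    }

    // Return the documents that contain every word of the query (AND semantics).
    Set<String> search(String query) {
        Set<String> result = null;
        for (String term : tokenize(query)) {
            if (term.isEmpty()) continue;
            Set<String> docs = postings.getOrDefault(term, Collections.<String>emptySet());
            if (result == null) {
                result = new TreeSet<>(docs);
            } else {
                result.retainAll(docs);
            }
        }
        return result == null ? new TreeSet<String>() : result;
    }

    // Lowercase the text and split it on runs of non-letter, non-digit characters.
    private static String[] tokenize(String text) {
        return text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
    }
}
```

Calling add once per converted file and then search("pdf manual") returns the names of the files containing both words. A real engine such as Lucene adds ranking, stemming, and an on-disk index on top of this basic idea.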

On Friday, November 1, 2002, at 04:17  PM, Michael Johnston wrote:

> Not sure about an embeddable library, but here are a couple of good
> standalone indexing/search engines:
>
> http://htdig.org/ is open source, and is perl
> http://alkaline.vestris.com is very fast and inexpensive, but not open
> source
>
> PDF and Word should be converted to text or html and the resulting
> file indexed. Everyone uses http://www.foolabs.com/xpdf/ for pdf;
> htdig has a doc2html script for word.
>
> Michael Johnston
>
>
> On Friday, November 1, 2002, at 06:12 PM, Steve Ivy wrote:
>
>> I'm doing some research for an app and one of the things I need is
>> the ability to index (and subsequently search, obviously) a store of
>> content in text documents, pdf files, and Word documents. It can be
>> Java or Obj-C. I prefer not to use straight C simply due to my own
>> limitations in the language. I'm wondering if anyone has knowledge of
>> anything like this. What is Apple using in Sherlock/iTunes/etc?
>> Whatever became of AIAT?
>>
>> TIA,
>>
>> --Steve
>> _______________________________________________
>> cocoa-dev mailing list | <email_removed>
>> Help/Unsubscribe/Archives:
>> http://www.lists.apple.com/mailman/listinfo/cocoa-dev
>> Do not post admin requests to the list. They will be ignored.


Related mails

Subject                        Author            Date
Indexing text, pdf, .doc       Steve Ivy         Nov 1, 18:12
Re: Indexing text, pdf, .doc   Michael Johnston  Nov 2, 01:17
Re: Indexing text, pdf, .doc   Joseph Heck       Nov 2, 01:30
Re: Indexing text, pdf, .doc   Marco Scheurer    Nov 2, 02:06
Re: Indexing text, pdf, .doc   Scott Anguish     Nov 2, 04:24
Re: Indexing text, pdf, .doc   Scott Anguish     Nov 2, 04:25