FROM : Jonathon Mah
DATE : Thu Jun 01 17:47:18 2006
Hi Philip,
On 2006-06-01, at 22:25, Philip Dow wrote:
> The spotlight docs clearly state that applications cannot request
> the kMDItemTextContent attribute from an MDItemRef. [...] Is there
> no way to work around this?
>
> /usr/bin/mdls does not list the attribute, but /usr/bin/mdimport at
> debug level 2 does. I suppose I could parse the returned string and
> extract the information, but I'm not very interested in doing this.
> Is there really no other way to get the spotlighted textual content
> of a file?
From what I understand, the text content differs from other
attributes in that it is not stored in the item's Spotlight record.
Instead it is used to index the record in a table of words, so a
search for "macro" returns the items that contain that word, but
there is no easy way to go backwards (to get the words from an
item). The text content string has a lot of redundancy anyway
(duplicated words, irrelevant whitespace, etc.). So for text content,
Spotlight is optimized for searching, not retrieving.
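To make the "table of words" idea concrete, here is a toy sketch in Python. It is only an illustration of a word-to-items index in general; I don't know Spotlight's actual store format, and the names (build_index, search, words_of) and the sample files are made up for the example.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lowercased word to the set of item names whose text contains it."""
    index = defaultdict(set)
    for name, text in documents.items():
        # split() drops the irrelevant whitespace; the set drops duplicates.
        for word in text.lower().split():
            index[word].add(name)
    return index


def search(index, word):
    """Forward lookup (word -> items): a single dictionary access."""
    return sorted(index.get(word.lower(), set()))


def words_of(index, name):
    """Backward lookup (item -> words): has to scan every entry in the table."""
    return sorted(word for word, names in index.items() if name in names)


# Hypothetical sample items, not real files.
documents = {
    "notes.txt": "macro expansion and macro hygiene",
    "todo.txt": "write the macro  tests",
}
index = build_index(documents)

print(search(index, "macro"))        # ['notes.txt', 'todo.txt']
print(words_of(index, "notes.txt"))  # ['and', 'expansion', 'hygiene', 'macro']
```

Note that even the backward scan only gives you the set of words: the original order and the repeated "macro" are gone, so the text itself cannot be rebuilt from a table like this.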
Of course, mdimport can show it because it just imported it --- but
it's not stored in the metadata store. I think you may be out of
luck, unless calling mdimport is a viable option.
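If calling mdimport does turn out to be acceptable, a minimal sketch of shelling out to it might look like the following (Python here just for brevity). The helper names are hypothetical, I am assuming the debug text may arrive on either stdout or stderr (so both are captured), and since the debug output format is not documented as far as I know, the sketch returns it raw rather than trying to parse it. Running it also asks Spotlight to import the file again.

```python
import subprocess
import sys

MDIMPORT = "/usr/bin/mdimport"


def mdimport_command(path, debug_level=2):
    """Build the argument list for an import at the given debug level."""
    return [MDIMPORT, "-d", str(debug_level), path]


def dump_import_debug(path):
    """Run mdimport on one file and return whatever it printed, unparsed."""
    result = subprocess.run(
        mdimport_command(path),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # assumption: debug text may be on stderr
        universal_newlines=True,
    )
    return result.stdout


# Only runs mdimport when a file path is given on the command line.
if __name__ == "__main__" and len(sys.argv) == 2:
    print(dump_import_debug(sys.argv[1]))
```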
The spotlight-dev list may be more appropriate for this question, and
you might have more luck finding a solution there.
Jonathon Mah
<email_removed>
Related mails:

| Author | Date |
|---|---|
| Philip Dow | Jun 1, 14:55 |
| Jonathon Mah | Jun 1, 17:47 |






Cocoa mail archive

