Re: Spotlight -- Metadata importer vs file package containing rtfd files

  • Just an update on this topic in case anyone else is interested...

    It appears that Spotlight and the rest of the MD system have
    implemented a "path filter" that aggressively prevents the use of MD
    queries inside of Finder-opaque file packages.

    Thus, not only does Spotlight not search inside of Finder-opaque
    packages (which I can sort-of understand), but you can't even _force_
    anything inside such a package to be indexed by using any aspect of
    the public API that I have been able to find.

    As a result, an application that encapsulates standard documents
    (like rtf or rtfd files) inside a file package cannot use
    NSMetadataQuery to retrieve metadata for the documents inside that
    package.

    This explains why the default mdimporter for RTFD files fails to
    index attachments to the rtfd file -- it can't use the OS to index
    the files, which could be anything.

    To provide a specific example:

    If you have an item, say ~/foo.bundle, and you create a query
    constrained to the bundle folder, the _content-indexing_ part of that
    query will always return zilch, whether you perform the query with
    the Spotlight UI or via NSMetadataQuery. Even if you attempt to index
    or search ~/foo.bundle/Contents, the query will return zilch for file
    content searches.
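
    For anyone who wants to reproduce the example above from a script,
    here is a minimal Python sketch that wraps the mdfind command-line
    tool. It assumes macOS with mdfind on the PATH, and it assumes that
    kMDItemTextContent is the attribute behind file-content searches on
    your system. The helper names build_scoped_query and
    run_scoped_query are made up for illustration.

```python
import os
import shutil
import subprocess


def build_scoped_query(scope_dir, text):
    """Build an mdfind command line restricted to one directory.

    -onlyin limits the search to scope_dir. The query string asks for
    items whose indexed text content contains `text`, ignoring case.
    """
    scope = os.path.expanduser(scope_dir)
    # Escape backslashes and double quotes so the query stays well-formed.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    query = 'kMDItemTextContent == "*{}*"c'.format(escaped)
    return ["mdfind", "-onlyin", scope, query]


def run_scoped_query(scope_dir, text):
    """Run the query and return the matching paths.

    Returns None when mdfind is not installed (for example, when this
    is run on something other than macOS).
    """
    if shutil.which("mdfind") is None:
        return None
    cmd = build_scoped_query(scope_dir, text)
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return [line for line in result.stdout.splitlines() if line]
```

    Calling run_scoped_query("~/foo.bundle/Contents", "some word") with
    a word that you know appears in a file inside the bundle should, if
    the behavior described above holds, come back with an empty list.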

    You can easily demonstrate this for yourself using rtfd files with
    attachments and the "show package contents" contextual menu item in
    the Finder: If you open a Finder window on the RTFD package contents,
    the window's Find bar is suddenly incapable of finding the contents
    of the TXT.rtf file that contains the rtfd file's text when the
    search is constrained to the window's directory. It can find the
    filenames, but ignores file-content.

    While I can certainly understand not having most opaque bundles
    searched by default, particularly during Spotlight's initial
    mass-indexing phase, having the code-level API suffer from the same
    constraint appears to be dubious at best.

    Note that there does not appear to be any fundamental technical issue
    here -- the mdimport tool that does the indexing has a -f flag that
    "forces" it to index inside of opaque file packages. The only reason
    for the constraint appears to be that nobody considered this issue
    when the API was laid out, and nobody reconsidered it when the
    default rtfd importer had to be crippled (compared to previous
    versions of content-indexing, which indexed rtfd attachments) because
    of the API limitation.
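
    As a rough illustration of the -f flag mentioned above, here is a
    minimal Python sketch that walks a package and hands every file in
    it to mdimport -f. It assumes macOS with an mdimport that accepts
    -f as described in this post; other versions of the tool may not.
    The helper names are made up for illustration, and I have not
    confirmed that files imported this way become searchable.

```python
import os
import shutil
import subprocess


def package_files(package_dir):
    """Return every regular file under package_dir, sorted by path."""
    found = []
    for root, _dirs, files in os.walk(os.path.expanduser(package_dir)):
        for name in files:
            found.append(os.path.join(root, name))
    return sorted(found)


def build_force_import(paths):
    """Build an mdimport command line that passes -f before the paths."""
    if not paths:
        raise ValueError("need at least one path to import")
    return ["mdimport", "-f"] + [os.path.expanduser(p) for p in paths]


def force_import(package_dir):
    """Run mdimport -f on the files inside a package.

    Returns the exit status of mdimport, 0 if the package holds no
    files, or None when mdimport is not installed.
    """
    if shutil.which("mdimport") is None:
        return None
    files = package_files(package_dir)
    if not files:
        return 0
    return subprocess.run(build_force_import(files), check=False).returncode
```

    For example, force_import("~/foo.bundle") would pass each file
    under ~/foo.bundle to mdimport in a single invocation.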

    IMHO, the API should be augmented with a searchOpaqueDirectories flag.

    Until then, I would welcome anyone's knowledge of a confirmed work-
    around.
  • On May 4, 2005, at 11:53 AM, Kirk Kerekes wrote:

    > Until then, I would welcome anyone's knowledge of a confirmed work-
    > around.

    I was wondering the same thing, so I asked on the macosx-talk mailing
    list. Here's the solution to overriding the path filter so that
    directories like /usr can be indexed:
    <http://www.omnigroup.com/mailman/archive/macosx-talk/2005-May/015830.html>
    <http://www.omnigroup.com/mailman/archive/macosx-talk/2005-May/015831.html>

    As for getting Spotlight to index files inside bundles, I haven't
    figured out any way of making it do so automatically. You should be
    able to index bundles manually by using the mdimport command,
    however. If anyone can figure out how to get it to automatically
    search inside bundles, I'd like to know...

    Nick Zitzmann
    <http://www.chronosnet.com/>