Problem enumerating a directory

  • Hi all,

    Apologies for asking a question which has been asked many times before, but I can't seem to find an answer to this particular one.

    I'm trying to list a directory recursively to build up a snapshot of the contents and store them in a Core Data DB, but keep running into issues with the directory list.  The Core Data part is working fine as far as I can tell!

    I've tried writing a recursive method that calls scandir() until the whole tree has been visited, but I'm coming unstuck converting NSStrings to const char* and then to char*.  Unless I supply the path directly as a hard-coded C string such as "/Users/mark", it works for a while, but then sometimes it forgets to prepend "/Users/mark" and starts scanning directories at the root of my HD!  Hard-coding the C string works perfectly but obviously isn't an option.

    -(void) scan: (char *)theDir{
        struct dirent **namelist;
        int n;
        size_t thisDirLength = strlen(theDir);

        n = scandir(theDir, &namelist, 0, NULL);
        if (n < 0){
            perror("scandir");
        }
        else {
            while(n--) {
                theCounter++;
                if (theCounter >= 1000) {
                    theCounter = 0;
                    [[self managedObjectContext] save:NULL];
                    [[self managedObjectContext] reset];
                    [thePool drain];
                    thePool = [[NSAutoreleasePool alloc] init];
                }
                if ((strcmp(namelist[n]->d_name,".") != 0) && (strcmp(namelist[n]->d_name,"..") != 0)) {
                    char* fullPath = malloc(thisDirLength + strlen(namelist[n]->d_name) + 2);
                    strcpy(fullPath, theDir);
                    strcat(fullPath, "/");
                    strcat(fullPath, namelist[n]->d_name);

                    [self addEntityWithPath:[NSString stringWithCString:fullPath encoding:NSUTF8StringEncoding]];

                    if (namelist[n]->d_type == DT_DIR) {
                        [self scan:fullPath];
                    }
                    free(fullPath);
                }
                free(namelist[n]);
            }
            free(namelist);
        }
    }

    I then gave up on that approach and opted for the easier but slower Cocoa solution (NSDirectoryEnumerator), but for some reason it gives up with neither an error nor a warning about halfway through the tree.  Could it be that modifications to the file system during the enumeration are causing it to fail?

    -(void) startSnapshotForPath:(NSString *) thePath {
        int theCounter = 0;
        NSDirectoryEnumerator *dirEnumerator = [[NSFileManager defaultManager] enumeratorAtPath: thePath];
        thePool = [[NSAutoreleasePool alloc] init];

        for (NSString *theSubPath in dirEnumerator) {
            [self addEntityWithPath:[thePath stringByAppendingPathComponent:theSubPath]];
            theCounter++;
            if (theCounter >= 1000) {
                theCounter = 0;
                [[self managedObjectContext] save:NULL];
                [[self managedObjectContext] reset];
                [thePool drain];
                thePool = [[NSAutoreleasePool alloc] init];
            }
        }

        [[self managedObjectContext] save:NULL];
        [[self managedObjectContext] reset];
        [thePool drain];
    }

    I've also tried Uli Kusterer's UKDirectoryEnumerator but that doesn't appear to be recursive!  I suspect (although I haven't tried) that requesting the type of the path (i.e. file/directory) and creating a new UKDirectoryEnumerator for each subdirectory would be massively expensive.

    Does anyone have any suggestions for where I can go from here please?  How can I find out why NSDirectoryEnumerator is failing half-way through the process, and how can I stop it doing so? Failing that, does anyone have a better suggestion for how I can build the snapshot please?

    Many thanks
    Mark
  • You should probably try -[NSFileManager enumeratorAtURL:includingPropertiesForKeys:options:errorHandler:]. The errorHandler block will give you more detail about any errors that occur in the middle of the enumeration (and give you the ability to ignore them).

    -KP

  • Thanks very much for the suggestion.  I've just given that a try, but it doesn't make any difference.  The enumeration still stops early, but the error handler block doesn't get called, making me think there's no error; the enumeration simply thinks it's finished.

    Anything else I could try?  FWIW, I've just installed 10.8 and it's still happening.

    M

  • On 30 Jul 2012, at 10:48, Mark Allan wrote:

    > Thanks very much for the suggestion.  I've just given that a try, but it doesn't make any difference.  The enumeration still stops early, but the error handler block doesn't get called, making me think there's no error; the enumeration simply thinks it's finished.
    >
    > Anything else I could try?  FWIW, I've just installed 10.8 and it's still happening.

    Are you able to determine anything in common about the directories it’s apparently skipping? Any symlinks involved perhaps?
  • On 30 Jul 2012, at 11:37, Mike Abdullah <cocoadev...> wrote:
    > On 30 Jul 2012, at 10:48, Mark Allan wrote:
    >> Thanks very much for the suggestion.  I've just given that a try, but it doesn't make any difference.  The enumeration still stops early, but the error handler block doesn't get called, making me think there's no error; the enumeration simply thinks it's finished.
    >>
    >> Anything else I could try?  FWIW, I've just installed 10.8 and it's still happening.
    >
    > Are you able to determine anything in common about the directories it’s apparently skipping? Any symlinks involved perhaps?

    No, I can't see anything in common.  It seems to stop at a different stage every time I run the code!

    Mark
  • On Jul 27, 2012, at 12:44 PM, Mark Allan <markjallan...> wrote:
    > I'm trying to list a directory recursively to build up a snapshot of the contents and store them in a Core Data DB, but keep running into issues with the directory list.  The Core Data part is working fine as far as I can tell!

    Just for grins, what if you take out the Core Data part and just do an NSLog?

        NSLog(@"[%@] [%@]", thePath, theSubPath);

    I'm wondering if there's any chance an exception is being thrown in the Core Data part and you're accidentally ignoring it. It might help to include a counter in the NSLog, in case the enumeration is dying after some particular number or multiple of iterations.

    Another thought: try running with NSZombie turned on. Maybe you have a memory error that you "get away with" for random amounts of time. Long shot, but I wonder in particular if thePath is getting clobbered somewhere, which *might* explain why it suddenly starts processing the root directory. (This is why I suggest NSLogging both thePath and theSubPath, separately, in case one of them is getting clobbered.)
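    For the scandir() variant, the same "log only, no Core Data" experiment might look roughly like the sketch below. It is an illustration under assumptions, not a fix: walk_and_log and its counter parameter are hypothetical names that do not appear in the original code, it checks for directories with lstat() instead of d_type (a substitution, since d_type can be DT_UNKNOWN on some file systems), and it takes a private strdup() copy of the incoming path so that nothing freed elsewhere during the walk (for example a C string obtained from an NSString whose pool has since been drained) can clobber it.

```c
/* Exposes scandir/strdup/lstat under strict -std= builds with glibc;
 * harmless on other platforms. */
#define _DEFAULT_SOURCE
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper (not from the original post): walks `dir`
 * recursively and prints a running counter, the parent directory and the
 * entry name, mirroring the NSLog suggestion above. No Core Data, no
 * autorelease pools. Returns 0 on success, -1 if any directory could not
 * be read or an allocation failed. */
static int walk_and_log(const char *dir, unsigned long *counter)
{
    assert(counter != NULL);

    /* Private copy of the path: it stays valid for the whole walk no
     * matter what the caller does with its own buffer. */
    char *base = strdup(dir);
    if (base == NULL)
        return -1;

    struct dirent **namelist = NULL;
    int n = scandir(base, &namelist, NULL, NULL);
    if (n < 0) {
        fprintf(stderr, "scandir(%s): %s\n", base, strerror(errno));
        free(base);
        return -1;
    }

    int status = 0;
    for (int i = 0; i < n; i++) {
        const char *name = namelist[i]->d_name;
        if (strcmp(name, ".") != 0 && strcmp(name, "..") != 0) {
            /* parent + '/' + name + terminating NUL */
            size_t len = strlen(base) + 1 + strlen(name) + 1;
            char *full = malloc(len);
            if (full == NULL) {
                status = -1;
            } else {
                snprintf(full, len, "%s/%s", base, name);
                (*counter)++;
                printf("[%lu] [%s] [%s]\n", *counter, base, name);

                /* Assumption of this sketch: use lstat() rather than
                 * d_type; it also means symlinks are not followed. */
                struct stat st;
                if (lstat(full, &st) == 0 && S_ISDIR(st.st_mode)) {
                    if (walk_and_log(full, counter) != 0)
                        status = -1;
                }
                free(full);
            }
        }
        free(namelist[i]);
    }
    free(namelist);
    free(base);
    return status;
}
```

    Called with a counter initialised to zero, the last number printed is the total number of entries visited, so two runs over a directory that isn't changing should end on the same value. If they do, that points away from the traversal itself and towards whatever was stripped out.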

    > I then gave up on that approach and opted for the easier but slower Cocoa solution (NSDirectoryEnumerator), but for some reason it gives up with neither an error nor a warning about halfway through the tree.  Could it be that modifications to the file system during the enumeration are causing it to fail?

    I notice you mention "/Users/mark" as a value you've tried hardcoding. Your home directory seems a pretty big dataset to test with, and it will certainly be modified during the test. What if you try a big but less-huge subdirectory that you know won't change during your test?

    A thought about why it seems to fail at random places -- I suspect NSDirectoryEnumerator isn't guaranteed to iterate in any particular order, so that *might* make the bug seem random. Another guess: are you using a thread (perhaps via NSOperation) to enumerate the directory? I'm guessing that's likely since you don't want to block your UI during this long operation. What if you do an enumeration, just for grins, on the main thread? Maybe have a button whose action is "testEnumeration:", and have it do nothing but enumerate and print NSLog statements?

    If you're using an NSOperation, is there any place in your code where you cancel the operation, and might be doing so accidentally? Or maybe the NSOperation is under-retained and getting dealloc'ed prematurely?

    When these "can't be happening" bugs occur, I find it sometimes helps to make it work *somewhere* and work forward from there to figure out what's not working in the place you really want. Throw away as many assumptions as possible, *especially* the very basic ones. Should it work with your home directory? Sure it should, but try with a smaller, unmutating directory anyway. Should it work in NSOperation? Sure it should, but try it on the main thread anyway. Should it work with the Core Data saves? Sure it should (well, if it's in a thread I'm not sure), but take that away too.

    (Now after all this, I bet you find the bug just as I hit Send... :))

    --Andy

  • On Jul 30, 2012, at 8:28 AM, Andy Lee <aglee...> wrote:
    > Another thought: try running with NSZombie turned on.

    This could lead to painful swapping, though, if you're testing over zillions of files and never dealloc'ing. Maybe save it as a last resort, or let it run for just a limited time in the hopes you'll find the bug before too much memory gets eaten up.

    --Andy
  • On Jul 30, 2012, at 2:48 AM, Mark Allan <markjallan...> wrote:

    > Thanks very much for the suggestion.  I've just given that a try, but it doesn't make any difference.  The enumeration still stops early, but the error handler block doesn't get called

    This sort of situation always makes me suspect an exception being raised. It might be internal to NSFileManager and get caught at a higher level, returning back to your code. (In which case it's a framework bug, and would be good to report to Apple.)

    Try setting an all-exceptions breakpoint (go to the breakpoints tab in the navigator and press the + button at the bottom).

    —Jens