[question] Checking for duplicate entries in NSMutableArray

  • Hi *,

    what would be an efficient and smart way to check for duplicate
    entries in an NSMutableArray? The objects of this array are of a
    class, that has only a bunch of strings as variables (i.e. firstName,
    lastName, memberOfGroup...).

    I've read about using NSDictionary when no duplicates are wanted at
    all, however, my case lies differently: duplicates must be allowed
    but should be affirmed by the user, namely in the case, that there
    actually is someone of the same name in the same group...yet...still
    another person.

    So i need a way to quickly and efficiently check the entire Array for
    an object which values are identical to the ones that just were
    entered. My thoughts were drifting into some sort of *creating a hash
    over the strings and store that in an additional NSString
    variable*...but i'm not sure if that's really efficient (checking the
    entire Array for String comparisons upon the creation (--> insertion)
    of a new object) and also i wouldn't be sure how to go about that).

    As always, i appreciate any feedback and have nothing but <3 for this
    list!

    Gute Nacht aus Berlin,

    Malte Philipp A.
  • In your class implement - (BOOL) isEqual: (id) other such that it does
    a string compare against firstName, lastName, memberOfGroup, etc.

    Check if an object equal to the object you want to insert already
    exists in the array with:

    if ( [arrayOfPeople containsObject: myNewPerson] )
    {
        // prompt the user to add a duplicate
    }
    {
        [arrayOfPeople addObject: myNewPerson];
    }

    That'll do it. Get fancy with optimizing - (BOOL) isEqual: (id) other
    if and when it becomes a bottleneck - otherwise keep things as simple
    as possible.

    When you implement -(BOOL) isEqual: (id) other you should also
    implement - (unsigned) hash. In this case I suggest it should be as
    simple as: (and, yeah, this ignores group membership for the sake of
    simplicity):

    - (unsigned) hash
    {
    return [[NSString stringWithFormat: @"%@%@", [self firstName], [self
    lastName]] hash];
    }

    If you ever determine that the case of adding a new person to your
    array is slow then look into optimizing isEqual and hash. You can do
    this quite simply: when your person mutates (name change, group
    membership change, or they become irradiated with gamma rays) then
    recalculate the hash you've got for them. In the "optimized" case
    isEqual will just compare the two objects hashes and the hashes will
    only be updated when the instance mutates.

    Don't do that off the bat - introducing state (the hash) to an object
    should be avoided and should only be employed when it's needed. You
    will screw up at one point and forget to mark the hash as needing an
    update and it'll end up biting you. Avoid that until it's necessary.
    Less code is the best code.

    From the brief description you give of the problem speed will not be
    an issue. You're talking about an action that, if not triggered by a
    user action then, at least, requires some user intervention. Unless
    your arrayOfPeople is going to hold millions of people then don't even
    think about getting cute and optimizing until you know (through Shark
    or some other performance analysis tool) that you need to do so.

    Hope that helps,
    Guy

    On 6-Oct-07, at 8:51 PM, <malte.philipp...> wrote:

    > Hi *,
    >
    > what would be an efficient and smart way to check for duplicate
    > entries in an NSMutableArray? The objects of this array are of a
    > class, that has only a bunch of strings as variables (i.e.
    > firstName, lastName, memberOfGroup...).
    >
    > I've read about using NSDictionary when no duplicates are wanted at
    > all, however, my case lies differently: duplicates must be allowed
    > but should be affirmed by the user, namely in the case, that there
    > actually is someone of the same name in the same group...yet...still
    > another person.
    >
    > So i need a way to quickly and efficiently check the entire Array
    > for an object which values are identical to the ones that just were
    > entered. My thoughts were drifting into some sort of *creating a
    > hash over the strings and store that in an additional NSString
    > variable*...but i'm not sure if that's really efficient (checking
    > the entire Array for String comparisons upon the creation (-->
    > insertion) of a new object) and also i wouldn't be sure how to go
    > about that).
    >
    > As always, i appreciate any feedback and have nothing but <3 for
    > this list!
    >
    > Gute Nacht aus Berlin,
    >
    > Malte Philipp A.
  • On 07/10/2007, at 10:51 AM, <malte.philipp...> wrote:

    > Hi *,
    >
    > what would be an efficient and smart way to check for duplicate
    > entries in an NSMutableArray? The objects of this array are of a
    > class, that has only a bunch of strings as variables (i.e.
    > firstName, lastName, memberOfGroup...).
    >
    > I've read about using NSDictionary when no duplicates are wanted at
    > all, however, my case lies differently: duplicates must be allowed
    > but should be affirmed by the user, namely in the case, that there
    > actually is someone of the same name in the same
    > group...yet...still another person.
    >
    > So i need a way to quickly and efficiently check the entire Array
    > for an object which values are identical to the ones that just were
    > entered. My thoughts were drifting into some sort of *creating a
    > hash over the strings and store that in an additional NSString
    > variable*...but i'm not sure if that's really efficient (checking
    > the entire Array for String comparisons upon the creation (-->
    > insertion) of a new object) and also i wouldn't be sure how to go
    > about that).
    >
    > As always, i appreciate any feedback and have nothing but <3 for
    > this list!
    >
    > Gute Nacht aus Berlin,
    >
    > Malte Philipp A.

    One option is that you use a dictionary with arrays as values. For
    example (typed in Mail):

    id v = [myDict objectForKey:key];

    if (v) {
      if (shouldAddIfDuplicate) {
        if ([v isKindOfClass;[NSMutableArray class]])
          [v addObject:newObject];
        else
          [myDict setObject:[NSMutableArray arrayWithObjects:v,
    newObject, nil] forKey:key];
    } else
      [myDict setObject:newObject forKey:key];

    In the example above I only store them as arrays when there are more
    than one but you could always store arrays for the values.

    - Chris
  • Guy's suggestions are all good. The one thing I personally would
    change is the hash implementation:

    > - (unsigned) hash
    > {
    > return [[NSString stringWithFormat: @"%@%@", [self firstName],
    > [self lastName]] hash];
    > }

    Simple is good, but hashing should be fast, and creating a temporary
    string is unnecessary. I'd just combine the hashes using a bitwise or
    arithmetic operator, eg:

    - (unsigned) hash
    {
    return [[self firstName] hash] ^ [[self lastName] hash];
    }

    ~Martin
previous month october 2007 next month
MTWTFSS
1 2 3 4 5 6 7
8 9 10 11 12 13 14
15 16 17 18 19 20 21
22 23 24 25 26 27 28
29 30 31        
Go to today