8

I have two arrays, an NSMutableArray and an NSArray. The NSMutableArray is the "store": it holds results from a source that delivers NSArrays. Every 5 minutes, a new NSArray comes in and the data needs to be filtered and sorted.

Sorting by date is pretty easy, so I managed to get the NSArray sorted by NSDate. Sorting the other array is not necessary, as it would only confuse the user.

What I want to do: the NSArray has a lot of different objects that all respond to -[object name], returning an NSString. The NSArray needs to be merged into the NSMutableArray, only adding new objects.

The merging itself is no problem, but performance is. The NSMutableArray can contain up to 3000 items, and the NSArray can contain up to 250 items, although usually only 5 or 6 of these have to be merged into the NSMutableArray.

So, my question is: how do you merge two arrays in Objective-C, filtering the duplicates, without iterating (250*3000) times?

Tom

Edited to clarify something
The "duplicate" objects are objects that are duplicate to the user but not to the code. They have the same name, but not the same address.

More clarification: @"value" != @"value" // true

6 Answers

10

Is name a property of the objects being stored in the arrays? If so, you could use a fairly simple NSPredicate to filter the immutable array before adding the results to the mutable one. Here's an example:

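// Keep only the incoming objects whose name matches none of the stored names.
NSPredicate *predicate = [NSPredicate predicateWithFormat:@"NONE name == %@.name", mutableArray];
NSArray *resultsArray = [immutableArray filteredArrayUsingPredicate:predicate];
// Add the filtered results rather than the unfiltered incoming array.
[mutableArray addObjectsFromArray:resultsArray];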

2 Comments

This solution looks good, but is probably only a bit less of a performance-eater than simply iterating through all objects manually. I'm currently using this as a temporary solution, until I have implemented a better way of doing this.
For an obscure reason, it is raising an Exception (iOS7): 'The left hand side for an ALL or ANY operator must be either an NSArray or an NSSet.' It's working fine when reversing the order of the predicate string: @"NONE %@.name == name"
6

How about this:

[mutable removeObjectsInArray:newArray];
[mutable addObjectsFromArray:newArray];

It isn't the fastest, but it is easy to implement :)

1 Comment

Only works if the actual objects are the same. Won't work here, because a property needs to be the same to have a duplicate.
0

Edited to remove some stupidity (left plenty, though)

A couple of options:

  1. Remove all matching objects from the NSMutableArray using removeObjectIdenticalTo. This requires iterating through the smaller array, but as you note they're commonly small. Then,

  2. Add all of the items from the new array using addObjectsFromArray

Or... well, it actually might be faster to instead:

  1. Iterate through the new array looking for matches with indexOfObjectIdenticalTo, using addObject to add in non-matching objects.

Costly either way, but doable.

1 Comment

This won't do what I need: the objects aren't identical, only the value of -[object name] is
0

I would probably start by creating a new mutable array which contains the contents of your NSMutableArray and NSArray. Then, sort the new array based on the name property and then run through the array once, only pulling out the unique items.
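
A minimal sketch of that sort-then-scan idea, in case it helps. The function name `MergedUniqueByName` is made up for illustration, and the code assumes every object answers the `name` key with a non-nil NSString. A stable sort is used so that, when two objects share a name, the one from the store comes first and is the one kept.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original answer): combines both arrays,
// sorts by name, then keeps the first object seen for each name.
// Assumes every object answers the "name" key with a non-nil NSString.
static NSArray *MergedUniqueByName(NSArray *store, NSArray *incoming) {
    // Store objects go first so that, with a stable sort, an existing
    // object precedes a newly arrived object with the same name.
    NSArray *combined = [store arrayByAddingObjectsFromArray:incoming];

    NSArray *sorted = [combined sortedArrayWithOptions:NSSortStable
                                       usingComparator:^NSComparisonResult(id a, id b) {
        return [[a valueForKey:@"name"] compare:[b valueForKey:@"name"]];
    }];

    // Single pass: equal names are now adjacent, so comparing each item
    // with the previously kept name is enough to drop duplicates.
    NSMutableArray *unique = [NSMutableArray arrayWithCapacity:sorted.count];
    NSString *previousName = nil;
    for (id item in sorted) {
        NSString *name = [item valueForKey:@"name"];
        if (previousName == nil || ![name isEqualToString:previousName]) {
            [unique addObject:item];
            previousName = name;
        }
    }
    return unique;
}
```

Note that the result comes back ordered by name, so this only fits if the store's existing order does not have to be preserved.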

2 Comments

I think it would slightly (!) reduce the amount of calculations needed, yes, but the NSMutableArray should remain the same and only add new objects.
Actually, the reduction in the number of calculations would be rather dramatic. From 3000 * 250 = 750,000 comparisons, you would go to roughly 40,000 (sorting 3,250 items takes about 3,250 * log2(3,250) ≈ 38,000 comparisons), which is close to a 20-fold improvement. The predicate method is cleaner, but I doubt it will be much faster. You could probably improve things a bit if you wrote your own sort algorithm which could remove an item from consideration whenever it compared two items that were equivalent.
0

Can you use NSSet and NSMutableSet instead? That could help deal with the duplicates issue.

Edit:

Based on your comments, you could use an NSSet to check for object membership quickly, in addition to your array. It'd require a bit more memory, but if you don't mind that, it could allow you to check really fast. You'd have your NSMutableArray backing store, and then an NSSet to keep track of object membership. You'd maintain the invariant that the NSMutableArray does not contain duplicates. You could use code like this:

// Assume that arrayStore is an NSMutableArray * instance variable
// Also, storeSet is an NSMutableSet * ivar

- (void)addObjectsFromArray:(NSArray *)data
{
    for (id item in data) {
        if (![storeSet member:item]) {
            // Will have to keep arrayStore sorted somehow
            [arrayStore addObject:item];
            [storeSet addObject:item];
        }
    }
}

You only have to iterate through the NSArray. I'm not sure how NSSet is implemented off the top of my head, but checking for membership won't be an O(n) operation like it is for an unsorted array.

It's not the most efficient method, but it works well with what you already have in place, with minor modifications.
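
Since the duplicates in the question are defined by name rather than by object identity, one variation is to keep a set of the names instead of a set of the objects. The sketch below is only an illustration under that assumption: the class `NameIndexedStore` and its members are hypothetical names, and every object is assumed to answer the `name` key with an NSString. NSString compares and hashes by content, so two distinct string instances with the same characters count as the same set member.

```objc
#import <Foundation/Foundation.h>

// Hypothetical wrapper (not from the original answer) around the store
// array plus a companion set holding the names already stored.
@interface NameIndexedStore : NSObject
@property (nonatomic, readonly) NSMutableArray *items;     // existing order is preserved
@property (nonatomic, readonly) NSMutableSet *knownNames;  // NSString names already stored
- (NSUInteger)mergeObjectsFromArray:(NSArray *)incoming;
@end

@implementation NameIndexedStore

- (instancetype)init {
    if ((self = [super init])) {
        _items = [NSMutableArray array];
        _knownNames = [NSMutableSet set];
    }
    return self;
}

// Appends only the objects whose name has not been seen before and
// returns how many were added. Objects without a name are skipped.
- (NSUInteger)mergeObjectsFromArray:(NSArray *)incoming {
    NSUInteger added = 0;
    for (id item in incoming) {
        NSString *name = [item valueForKey:@"name"];
        if (name == nil || [self.knownNames containsObject:name]) {
            continue;
        }
        [self.items addObject:item];
        // Copy the name so a mutable string cannot change inside the set.
        [self.knownNames addObject:[name copy]];
        added++;
    }
    return added;
}

@end
```

With this layout the merge visits each incoming object once and does a set lookup per object, instead of scanning the whole store array for every incoming item.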

3 Comments

NSSet is 1) Unordered and 2) Only works for adding the same object twice. Two identical objects at different addresses won't be seen by NSSet.
@Tom van der Woerdt: Sets use isEqual to compare objects, so if your class overrides isEqual, you can compare based on something other than memory location. Also, a set is unordered, but you can turn a set into an array and sort it when needed (unless you need it to be sorted all the time).
Oh, I didn't know about the isEqual part, sounds good. However, yes, they have to be sorted all the time.
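
The isEqual point from the comments above could look roughly like this. The `Item` class here is a hypothetical stand-in for the asker's base class, not their actual code. The one thing to be careful about is that `hash` has to be overridden together with `isEqual:`, because sets and dictionaries use the hash to find candidates before they call `isEqual:`.

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for the "item" base class mentioned in the thread.
@interface Item : NSObject
@property (nonatomic, copy, readonly) NSString *name;
- (instancetype)initWithName:(NSString *)name;
@end

@implementation Item

- (instancetype)initWithName:(NSString *)name {
    if ((self = [super init])) {
        _name = [name copy];
    }
    return self;
}

// Two items are equal when their names have the same contents,
// regardless of where the objects live in memory.
- (BOOL)isEqual:(id)other {
    if (other == self) return YES;
    if (![other isKindOfClass:[Item class]]) return NO;
    return [self.name isEqualToString:[(Item *)other name]];
}

// Equal objects must return equal hashes, so derive the hash from the name.
- (NSUInteger)hash {
    return self.name.hash;
}

@end
```

Because the items are described as immutable, basing the hash on the name should be safe; with mutable names, an object already inside a set could no longer be found after its name changed.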
0

There are likely many ways to dramatically improve performance, but to be able to suggest any, we really need to know more about what the objects in the arrays "are": what do they represent? How are they being used? (For example, are the items in the store array being displayed in a table view?)

NSMutableDictionary, NSMutableSet, etc. could be combined with NSMutableArray to organize and implement the model in an efficient manner.

For example, let's say we know the object represents a person: MDPerson. A person has a gender, a date of birth, a name, a unique id, and a set of attributes that can change. Given this higher level understanding of what the object represents, we know that 2 people are equal only if their unique ids are the same (in other words, 2 different people can have the same name, gender, and date of birth). Let's say that your main NSMutableArray is made up of a list of 3000 people. The incoming array is made up of 500 people which are already in the main NSMutableArray. A few of these 500 people instances might have "updated" attributes, which means that their instance in the main array needs to be updated with that info.

Given that understanding, it's clear that the main list should be implemented as an NSMutableDictionary rather than an NSMutableArray. In the dictionary, the person's unique id would be the key, and their person instance would be the value for the key. You could then loop through the incoming array of 500 persons only once:

// The main dictionary is called personIDsAndPersons.

for (MDPerson *person in incomingPersons) {
    MDPerson *existingPerson = [personIDsAndPersons objectForKey:[person uniqueID]];
    // If nil, the person doesn't exist yet.
    if (existingPerson) {
        // Update the existing person's attributes.
        [existingPerson setUniqueAttributes:[person uniqueAttributes]];
    }
}

Again, without knowing more of the details or having a higher level understanding of what the objects are, we're really just shooting in the dark.

You mention that 2 items are only the same if they have the same name. So, does that mean that each of the 3000 objects in the main array has a unique name? If so, you could use an NSMutableDictionary to allow access to the objects in an efficient manner by having the keys in the dictionary be the name and the values be the object instance. You could then use a separate NSMutableArray that's used merely for display purposes: it allows an ordered, sorted organization of the same objects that are stored in the NSMutableDictionary. Remember that when you add an object to an array or a dictionary, normally you're not creating a new copy, you're just retaining the existing object.
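
As a rough sketch of the dictionary-plus-array arrangement described above: the function `MergeNewItems` and its parameter names are invented for illustration, and the objects are assumed to answer the `name` key with an NSString. The function returns the objects it actually added, which might be handy if a table view needs to insert rows for them.

```objc
#import <Foundation/Foundation.h>

// Hypothetical merge helper (names are illustrative only).
// `itemsByName` maps each name to its object; `displayItems` holds the
// same objects in the order shown to the user.
static NSArray *MergeNewItems(NSArray *incoming,
                              NSMutableDictionary *itemsByName,
                              NSMutableArray *displayItems) {
    NSMutableArray *added = [NSMutableArray array];
    for (id item in incoming) {
        NSString *name = [item valueForKey:@"name"];
        if (name == nil || itemsByName[name] != nil) {
            continue; // unnamed, or an object with this name is already stored
        }
        // The dictionary copies the key; the object itself is only retained,
        // so both containers share the same instance.
        itemsByName[name] = item;
        [displayItems addObject:item];
        [added addObject:item];
    }
    return added;
}
```

Each incoming object costs one dictionary lookup, so a batch of 250 items stays at about 250 lookups however large the store grows.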

1 Comment

All items are immutable objects, all subclassed from one "item" class which requires all subclasses to implement a "name" method to allow checking for duplicates. And yes, they are all in a sort of table view.
