Removing equivalent but unique objects from a Javascript array

Question

I have an array of objects similar to the following:

var routeArr = [
    {start: 1, end: 2},
    {start: 1, end: 3},
    {start: 1, end: 4},
    {start: 2, end: 1},
    {start: 3, end: 1},
    {start: 4, end: 1}
];

These objects represent the start and end point of lines and as such, {start: 1, end: 2} and {start: 2, end: 1} represent the same line.

I am trying to remove all duplicate lines from the array and cannot find an efficient or elegant way to do it. I have tried a nested loop, but I've been told that is bad practice (and I'm getting errors with my implementation, and it's just ugly).

for(var i = 0, numRoutes = routeArr.length; i < numRoutes; i++) {
    var primaryRoute = routeArr[i];

    for(var j = 0; j < numRoutes; j++) {
        var secondRoute = routeArr[j];

        if(primaryRoute.start === secondRoute.end && primaryRoute.end === secondRoute.start) {
            routeArr.splice(j, 1);
            continue;
        }
    }
}

Can anyone offer suggestions?

The usual way to do this is to normalize first: in your case, swap start and end whenever end > start, so that duplicate lines become exactly identical. Then sort the array and loop over it to remove the duplicates.
– CY_, Commented May 23, 2016 at 22:43
When you remove elements from an array, never run the loop from 0 to length. It is not safe, because after each removal you have to adjust your indices. It is better to loop in descending order, i.e. from length-1 to 0: removing an element then only shifts elements with bigger indices, which you have already visited. Also, your if statement checks only one of the two ways in which lines can be identical; you have to add the other check with an or as well.
– simon, Commented May 23, 2016 at 23:03
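
The normalize-and-sort idea from the first comment could be sketched roughly as follows. This is only an illustration, assuming numeric start and end values; dedupeBySorting is a hypothetical helper name that does not appear in the thread. It returns normalized copies in sorted order, so the original orientation and ordering of the routes are not preserved.

```javascript
// Sketch of "normalize, sort, then drop adjacent duplicates".
// dedupeBySorting is a hypothetical name; start/end are assumed to be numbers.
function dedupeBySorting(routes) {
    // Copy each route with the smaller endpoint first; the input objects are untouched.
    var normalized = routes.map(function (route) {
        return {
            start: Math.min(route.start, route.end),
            end: Math.max(route.start, route.end)
        };
    });

    // Sort by start, then by end, so that equal routes end up next to each other.
    normalized.sort(function (a, b) {
        return (a.start - b.start) || (a.end - b.end);
    });

    // Keep a route only if it differs from its predecessor.
    return normalized.filter(function (route, index, sorted) {
        if (index === 0) { return true; }
        var previous = sorted[index - 1];
        return route.start !== previous.start || route.end !== previous.end;
    });
}

var sortedUnique = dedupeBySorting([
    {start: 1, end: 2}, {start: 1, end: 3}, {start: 1, end: 4},
    {start: 2, end: 1}, {start: 3, end: 1}, {start: 4, end: 1}
]);
console.log(sortedUnique);
```

Sorting makes this O(n log n), so the keyed lookup used in the answers below is likely the better fit when the original order matters or the array is large.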

simon · Accepted Answer · 2016-05-24 15:32:55Z

3

Create an object/map in JavaScript that keeps the indexes of the unique objects: store "min(start,end):max(start,end)" as the key and the index as the value. Here is an implementation for your case:

// your initial array
var routeArr = [
    {start: 1, end: 2},
    {start: 1, end: 3},
    {start: 1, end: 4},
    {start: 2, end: 1},
    {start: 3, end: 1},
    {start: 4, end: 1}
];

// map where we will store key => value where key is a joined start,end of your array's item and value is an item index 
var keyToRouteIndexMap = {};

for (var i = 0; i < routeArr.length; i++){
    // calculating min and max from start and end to understand {start:1, end:2} and {start:2, end:1} object as duplicates
    var min = Math.min(routeArr[i].start,routeArr[i].end);
    var max = Math.max(routeArr[i].start,routeArr[i].end);
    // unique key 
    var key = min+':'+max;
    if (!keyToRouteIndexMap.hasOwnProperty(key)){
        keyToRouteIndexMap[key] = i;
    }
}

for(var key in keyToRouteIndexMap){
    if(keyToRouteIndexMap.hasOwnProperty(key)){
        console.log(routeArr[keyToRouteIndexMap[key]]);
    }
}

edited May 24, 2016 at 15:32

answered May 23, 2016 at 22:43

simon


6 Comments

Hamms Over a year ago

Note that if you are using ES6, you can instead use the Set object

andor kesselman Over a year ago

There are also a few JavaScript implementations built to imitate hash sets: github.com/timdown/jshashtable

le_m Over a year ago

@Hamms: How would the Set object help? It only compares object references.

le_m Over a year ago

@Vahan Simonyan: You should probably check for hasOwnProperty() when iterating over object keys.

Hamms Over a year ago

@le_m you obviously couldn't use the objects themselves as the elements of the Set, but you could construct a key the same way this example is and simply put it in a set rather than re-purposing an object to act as a set.
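
What Hamms describes might look something like the sketch below. It assumes an ES6 environment and numeric start and end values; dedupeWithSet is a hypothetical name. The Set stores the constructed string keys, not the route objects themselves, which sidesteps the reference-comparison issue le_m points out.

```javascript
// Sketch only: a Set of constructed keys instead of a plain object used as a set.
// dedupeWithSet is a hypothetical name; start/end are assumed to be numbers.
function dedupeWithSet(routes) {
    const seen = new Set();
    return routes.filter(function (route) {
        // Same key scheme as in the answer above: smaller endpoint first.
        const key = Math.min(route.start, route.end) + ':' + Math.max(route.start, route.end);
        if (seen.has(key)) {
            return false; // an equivalent route was already kept
        }
        seen.add(key);
        return true;
    });
}

const setUnique = dedupeWithSet([
    {start: 1, end: 2}, {start: 1, end: 3}, {start: 1, end: 4},
    {start: 2, end: 1}, {start: 3, end: 1}, {start: 4, end: 1}
]);
console.log(setUnique);
```

This keeps the first occurrence of each line and leaves the input array unmodified.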


Redu · Accepted Answer · 2016-05-24 14:43:28Z

You can do it like this. I guess this is very fast since there are no searches at all: a single Array.prototype.reduce() operation constructs both the hash table (lookup table) and the reduced object at the same time. Mapping the object keys then gives the result. Here it is:

var routeArr = [
    {start: 1, end: 2},
    {start: 1, end: 3},
    {start: 1, end: 4},
    {start: 2, end: 1},
    {start: 3, end: 1},
    {start: 4, end: 1}
],

reduced = routeArr.reduce((p,c) => {!(p[c.start+"-"+c.end] || p[c.end+"-"+c.start]) && (p[c.start+"-"+c.end] = c);
                                     return p;},{}),
 result = Object.keys(reduced).map(e => reduced[e]);
console.log(result);

Well, on second thought, I eliminated the redundant Object.keys() portion. Now this is nothing more than a single Array.prototype.reduce() pass, completed in O(n). I suppose this might be as far as it gets performance-wise. Check it out.

var routeArr = [
    {start: 1, end: 2},
    {start: 1, end: 3},
    {start: 1, end: 4},
    {start: 2, end: 1},
    {start: 3, end: 1},
    {start: 4, end: 1}
],

     reduced = routeArr.reduce((p,c) => {!(p[c.start+"-"+c.end]  ||
                                           p[c.end+"-"+c.start]) &&
                                          (p[c.start+"-"+c.end] = true,
                                           p.result.push(c));
                                           return p;
                                        },{"result":[]});
console.log(reduced.result);

Well, OK, yes, I have to agree it looks a little cryptic, but it is very simple.

We are using the Array.prototype.reduce() method with an initial value, here {"result":[]}. So when reducing routeArr, the accumulator we start with is an object with a single property named result whose value is an empty array.
reduce is given an anonymous callback function that takes two arguments, (p,c): p stands for previous and c stands for current. In the first run, p is our initial object {"result":[]} and c is the item at index 0 of the array (routeArr) that we called reduce upon, so c is {start: 1, end: 2}.
At the beginning of every round we check whether p already contains a property that represents the current element's values in either order. The check is !(p[c.start+"-"+c.end] || p[c.end+"-"+c.start]), which in human terms means "is it true that you don't have a string property like c.start-c.end or c.end-c.start?". In the first round, for example, the check is "is it true that you don't have a string property like "1-2" or "2-1"?". If p has one (the check is false) we do nothing; if it hasn't, we perform the following:
&& (p[c.start+"-"+c.end] = true, p.result.push(c)); return p;. The && ties the two instructions in the parentheses to the preceding condition: in a && b, the JS engine only evaluates b if a evaluates to a truthy value. In human terms: when "is it true that you don't have a string property like "1-2" or "2-1"?" comes out true, we create a property "1-2" with the value true, so that in later rounds we do nothing when we meet a 1-2 or a 2-1. Then we push the current object to the result property of the same object (p.result), where it becomes the unique representative of all its duplicates or twins. Finally we return p so that the reduce cycles can continue.

I hope it is clear.

Beautiful functional solution. Now I would really like to see a performance comparison with the non-functional approaches.
I think now you overdid it a bit with the code minification. Choosing self-documenting variable names and not putting assignments into comparisons might help the OP to better understand your nice solution :)
@le_m yes i guess you might be right. I will put some explanation below. To me this looks like a poem though. :)
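
Along the lines of le_m's suggestion, the same single reduce pass might be written with longer names and a plain if statement. This is a sketch, not Redu's code: the names state, seen, forwardKey and reverseKey are made up for illustration, and the lookup table is kept in its own seen property instead of sharing one object with result.

```javascript
var routes = [
    {start: 1, end: 2}, {start: 1, end: 3}, {start: 1, end: 4},
    {start: 2, end: 1}, {start: 3, end: 1}, {start: 4, end: 1}
];

// One pass: state.seen records the keys met so far, state.result collects
// the first route of every equivalence class.
var uniqueRoutes = routes.reduce(function (state, route) {
    var forwardKey = route.start + '-' + route.end;
    var reverseKey = route.end + '-' + route.start;

    // Only act when neither orientation has been recorded yet.
    if (!state.seen[forwardKey] && !state.seen[reverseKey]) {
        state.seen[forwardKey] = true;
        state.result.push(route);
    }
    return state; // hand the accumulator on to the next round
}, {seen: {}, result: []}).result;

console.log(uniqueRoutes);
```

The behavior should match the terse version: each route is looked up under both orientations and stored under one.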

le_m · Accepted Answer · 2016-05-24 19:11:27Z

2

Here is a general solution to the problem of removing duplicate values from javascript arrays:

/**
 * Takes an input array and returns a new array without identical elements.
 *
 * @param {Array} input
 * @param {function} id  identity function returning identical values for identical elements
 * @returns {Array} new array holding the first element seen for each identity
 */
function uniquify(input, id) {
    // Declared with var so they do not leak into the global scope.
    var result = [];
    var map = {};
    for (var i = 0, length = input.length; i < length; ++i) {
        var element = input[i], identity = id(element);
        if (!map.hasOwnProperty(identity)) {
            result.push(element);
            map[identity] = true;
        }
    }
    return result;
}

Applied to your given routeArr:

var routeArr = [
    {start: 1, end: 2},
    {start: 1, end: 3},
    {start: 1, end: 4},
    {start: 2, end: 1},
    {start: 3, end: 1},
    {start: 4, end: 1}
];

routeArr = uniquify(routeArr, function(route) {
    return route.start < route.end ? '' + route.start + ':' + route.end : '' + route.end + ':' + route.start;
});

edited May 24, 2016 at 19:11

answered May 23, 2016 at 23:34

le_m


1 Comment

Redu Over a year ago

I guess the callback function maps {start: 2, end: 3} and {start: 6, end: 1} to the same location.
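
Whether two different routes can end up under the same key depends on how the key is built. As a quick check (keyWithSeparator and keyWithoutSeparator are hypothetical names; the first mirrors the ':'-joined callback shown in the answer as it currently stands), the two routes from the comment get different keys, while a key built without a separator can collide:

```javascript
// Mirrors the callback from the answer: smaller endpoint, ':', larger endpoint.
function keyWithSeparator(route) {
    return route.start < route.end
        ? route.start + ':' + route.end
        : route.end + ':' + route.start;
}

// Hypothetical variant without a separator, shown only to illustrate collisions.
function keyWithoutSeparator(route) {
    return route.start < route.end
        ? '' + route.start + route.end
        : '' + route.end + route.start;
}

console.log(keyWithSeparator({start: 2, end: 3}));       // "2:3"
console.log(keyWithSeparator({start: 6, end: 1}));       // "1:6"

console.log(keyWithoutSeparator({start: 1, end: 123}));  // "1123"
console.log(keyWithoutSeparator({start: 11, end: 23}));  // "1123", same key for a different route
```

A separator character that cannot occur inside the endpoint values is what keeps the keys unambiguous.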

Iron Gremlin · Accepted Answer · 2016-05-24 22:06:38Z

2

Your nested loop methodology is 'ugly', but that isn't your issue.

Your implementation errors result from the fact that both of your for loops assume the array won't change while you are mutating it, which causes you to skip over some items in the array.

'i' and 'j' are 'stupid' incrementers. The for loop isn't telling the code to go to the next item in the array with each iteration; it's telling it to go to array[last_index_i_used+1]. So when you splice something, the array you're looking at changes, and the next item in line gets passed over.
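
A small illustration of that skipping effect, using a throwaway array of numbers and made-up variable names (forward, backward), not the question's data:

```javascript
var numbers = [1, 2, 2, 3];

// Forward loop: after splicing index 1, the second 2 slides into index 1,
// but the loop has already moved on to index 2 and never looks at it.
var forward = numbers.slice();
for (var i = 0; i < forward.length; i++) {
    if (forward[i] === 2) {
        forward.splice(i, 1);
    }
}
console.log(forward);  // [1, 2, 3], one 2 survived

// Backward loop: a splice only shifts elements that were already visited.
var backward = numbers.slice();
for (var j = backward.length - 1; j >= 0; j--) {
    if (backward[j] === 2) {
        backward.splice(j, 1);
    }
}
console.log(backward); // [1, 3]
```

The loops in the question also cache the length in numRoutes before splicing, so after a removal routeArr[j] can be undefined for the last indices, and reading .end from it would throw a TypeError.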

I see a lot of fancy array methods and ES6 suggestions, but I assume from your question that you are still a bit new to JS, and could use some time building fundamentals (no offense intended).

Try a recursive decrementing function:

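function uniquify(inputArray, ind) {
    if (ind < 0) { return; } // nothing left to check (also covers an empty array)
    var checkStart = inputArray[ind].start, checkEnd = inputArray[ind].end;
    for (var i = ind - 1; i > -1; --i) {
        var thisStart = inputArray[i].start, thisEnd = inputArray[i].end;
        // Same line in the same direction, or the same line reversed.
        if ((thisStart === checkStart && thisEnd === checkEnd) ||
            (thisStart === checkEnd && thisEnd === checkStart)) {
            inputArray.splice(i, 1);
            --ind; // the element being checked has shifted down by one
        }
    }
    // Continue with the element just below the one we checked.
    uniquify(inputArray, ind - 1);
}
uniquify(routeArr, routeArr.length - 1);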

I like that better than a nested for loop, as you are never hitting the same value more often than you need to. Note that the number of comparisons still grows quadratically with the size of the array.

But you might want to ask yourself if whatever is defining 'routeArr' is doing whatever it's doing in a way that is intelligent - At best, it seems like it's wasting memory and CPU storing data in an inefficient way.

edited May 24, 2016 at 22:06

answered May 24, 2016 at 0:47

Iron Gremlin


3 Comments

le_m Over a year ago

I recommend to use semicolons everywhere even though they are not always needed in JS. Also, your loop index i is a global variable, better make it local with var.

ewokthegreat Over a year ago

You are correct in assuming that I am new to JS, at least in a project of this size. Long story short, we're rewriting an old C++ app to run on iPad and Android. The decision was made to use HTML5 and JavaScript with Cordova (which I supported at first, but now have my misgivings about). Unfortunately, some design decisions made in the previous C++ version force us to work with the application data in unconventional ways. I'm trying to find an elegant and efficient solution. Thank you for your response, I'm trying it as we speak.

Iron Gremlin Over a year ago

@le_m - Thanks for catching the lack of var in my loop. Edited to fix.

Mina Jacob · Accepted Answer · 2016-05-24 00:42:10Z

1

I wrote the function below to do it neatly:

var routeArr = [{
  start: 1,
  end: 2
}, {
  start: 1,
  end: 3
}, {
  start: 1,
  end: 5
}, {
  start: 2,
  end: 1
}, {
  start: 3,
  end: 1
}, {
  start: 4,
  end: 1
}];

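routeArr.IsDuplicate = function(obj) {
    // Count the routes that match obj in either direction (obj itself included).
    var i = this.length;
    var count = 0;
    while (i--) {
        if ((this[i].start === obj.start && this[i].end === obj.end ) || (this[i].start === obj.end && this[i].end === obj.start) ) {
            count++;
        }
    }
    return count > 1;
};

// Walk backwards so that splice() never shifts an element we have yet to visit.
for (var i = routeArr.length - 1; i >= 0; i--) {
    if (routeArr.IsDuplicate(routeArr[i])) routeArr.splice(i, 1);
}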

edited May 24, 2016 at 0:42

answered May 24, 2016 at 0:36

Mina Jacob


2 Comments

andor kesselman Over a year ago

This is operationally inefficient. It requires you to evaluate each pair multiple times. If you increase the routeArr length to 19, for example, 189 evaluations are made. A map seems to be a cleaner approach. In Java, a HashMap would be a great data structure for this type of implementation. github.com/timdown/jshashtable is a hash set implementation in JavaScript.

le_m Over a year ago

javascript objects already offer a 'hashmap' implementation, the only issue is that keys cannot be objects - so you need a 'hash' function either by a general mapping of objects to hashes (your link) which is inefficient as you would need to iterate through the prototype chain or by a more performant user-supplied function.
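
To illustrate the point about keys, here is a short sketch of what happens when objects are used directly as property keys, and how a user-supplied key function avoids it. routeKey is a hypothetical name, and the key scheme assumes numeric start and end values.

```javascript
// Plain objects convert every key to a string, so two different route objects
// used as keys both become "[object Object]" and overwrite each other.
var table = {};
table[{start: 1, end: 2}] = 'first';
table[{start: 3, end: 4}] = 'second';
var objectKeys = Object.keys(table);
console.log(objectKeys); // a single key: "[object Object]"

// A user-supplied "hash" function turns each route into a distinct string.
function routeKey(route) {
    return Math.min(route.start, route.end) + ':' + Math.max(route.start, route.end);
}

var keyed = {};
keyed[routeKey({start: 1, end: 2})] = 'first';
keyed[routeKey({start: 3, end: 4})] = 'second';
var stringKeys = Object.keys(keyed);
console.log(stringKeys); // "1:2" and "3:4"
```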
