1

I have read a lot about this topic and I saw many different algorithms. I have stumbled across another solution which Im having a hard time to understand its efficiency compared to other algorithms, since its using a simple temporary object to hold the existing elements of the array. Is that a valid solution compared to the "old school" method using a sophisticated sorting method and comparison?

 function removeDup(arr){
        var element,
                existObj= {},
                finalArr = [];

        for(var i=0;i<arr.length;i++){
            element = arr[i];
            if(!existObj[element]){
                finalArr.push(element);
                existObj[element] = true;
            }
        }
        return finalArr;
    }
    //console.log(removeDup([2,2,2,2,4534,5,7,3]));
    console.log(removeDup(["mike","john","alex","mike","john"]));

A friend told me that the efficiency here cannot be clearly determined because i dont really know how was the temp object implemented.

5
  • 2
    Think of existObj as a hash map - with near O(1) performance for assigning and accessing. Commented Sep 29, 2014 at 9:58
  • It's easier using Array.filter Commented Sep 29, 2014 at 10:00
  • 1
    @Bergi : you're right, and since O(1)+constant === 0(1), we have not a 'near' O(1) for the lookup, but exactly O(1). So this algorithm is O(n). Now, as shown in my answer, we can get a 6-10X improvement on the 'k' of this O(n) by using the best-fit native object. Commented Sep 29, 2014 at 16:34
  • 1
    @GameAlchemist: I'm not talking about constants. Assigning and accessing elements in a hashmaps will always depend on (k) size of the key (e.g. length of the property string), (v) size of the stored values, and (n) number of items in the map - just with a very good complexity. (disclaimer: v is constant for booleans like here, or object pointers; k can be reduced by a lazy hashing function that doesn't visit all bytes; and n could be limited if the key space is finite) Commented Sep 29, 2014 at 16:46
  • 1
    @Bergi : i wanted to check for real that O(1) is good enough, so i did a plot (see my answer). Indeed O(1) is a good approximation, even if the spread raises with n. Commented Sep 30, 2014 at 10:23

2 Answers 2

1

You'll get the best performances by using most appropriate data structure. In a JIT/interpreted language like js, the gains of using a native functionality are tremendous.

Here it's a Set you should use in the first place : this way you don't even have to do anything to remove dups, they will just be ignored when added.
I just did a simple test, and performances are around six to ten times (!!) faster with a Set.

http://jsbin.com/jofofeyixaco/1/edit?js,console

result example :

"overhead is : 0.015700000221841037"
"Built unic numbers with a lookup object in : 6.237600000167731"
"Built unic numbers with a Set in : 0.7921500000520609"

Here are the curves for n = 0 to 50.000 for both algorithm.
We see that indeed the hashmap behaves quite like O(1), but with a higher spread when n raises.
Set is almost perfectly linear.

enter image description here

drawing jsbin (be patient ! ) : http://jsbin.com/jofofeyixaco/2/

Code :

// noprotect
// build a test set
var numbers = [];
var cnt = 10000; 
for (var i=0; i<cnt; i++ ) numbers.push(Math.floor(Math.random*1000));

// build unic values using lookup object
function buildWithObject() {
  var existing= {};
  var unicNumbers = [];
  for (var i=0; i<cnt; i++) {
    var num = numbers[i];
    if (!existing[num]) {
      unicNumbers.push(num);
      existing[num]=true;
    }
  }
}

// build unic values using a Set
function buildWithSet() {
    var unicNumbersSet = new Set();
    for (var i=0; i<cnt; i++) {
         var num = numbers[i];
         unicNumbersSet.add(num);
    }  
}

function iterate() {
    for (var i=0; i<cnt; i++) {
         var num = numbers[i];
    }    
}

// warming up functions
for (var i=0; i<30; i++) { buildWithObject(); buildWithSet() ;  iterate(); }

// --------  Measures  --------------------
var measureRepeat = 20;
var m;

var s,e;
// ----------------------------
m=measureRepeat;
s=window.performance.now();
while (m--) iterate();
e=window.performance.now();

console.log('overhead is : ' + (e-s)/measureRepeat);

// ----------------------------
m=measureRepeat;
s=window.performance.now();
while (m--) buildWithObject();
e=window.performance.now();

console.log('Built unic numbers with a lookup object in : ' + (e-s)/measureRepeat);

// ----------------------------
m=measureRepeat;
s=window.performance.now();
while (m--) buildWithSet();
e=window.performance.now();
console.log('Built unic numbers with a Set in : ' + (e-s)/measureRepeat);

(don't forget, Set is EcmaScript 6, so use, in the js tag, type="application/javascript;version=1.7"

If you're concerned about compatibility : http://kangax.github.io/compat-table/es6/#Set
All 'modern' platforms ok : Ch, FF, IE11, OS8
All others not ok. )

Sign up to request clarification or add additional context in comments.

4 Comments

Uh, yes, I would be concerned about compatibility of Set.
@Bergi : yes, this might be a concern depending on your target. I updated to summarize the compatibility table. ( yet within a few month all should be ok since Harmony is (somehow) welcome by everyone ).
Thanks for the great answer. So to sum all up, is it correct to assume that using an object as a lookup map is fine and the complexity is O(N). But, it is much faster to use a Set, but the complexity is still O(N)?
Yes, as you know O(N) means k*n+ctte, and the k constant for a Set is around 10 times smaller.
0
let existing = {};
let unicNumbers = [];   
arr.forEach((item) => {
        if (!existing[item.id]) {
            unicNumbers.push(item);
            existing[item.id] = true;
        }
    })

1 Comment

While this code may provide a solution to the question, it's better to add context as to why/how it works. This can help future users learn, and apply that knowledge to their own code. You are also likely to have positive feedback from users in the form of upvotes, when the code is explained.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.