
I used to think that HashSet is a pretty fast data structure because it uses hashes (and is, in turn, implemented on top of HashMap). I was solving some problems and decided to check its performance, so here it is:

You are given an array of numbers, e.g. [11, 3, 11, 11, 3, 2, 0, -2, 2]. You are supposed to write a function that returns the numbers that appear an odd number of times.

Here is my solution:

import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

public class OddNumInArray {

    public static List<Integer> oddNumList(int[] ar) {
        Collection<Integer> l = new ArrayList<>();
        for (int n : ar) {
            if (l.contains(n)) {
                // static type is Collection, so this is remove(Object) with n autoboxed,
                // not List.remove(int index)
                l.remove(n);
            } else {
                l.add(n);
            }
        }
        return (List<Integer>) l;
    }

    public static Set<Integer> oddNumHSet(int[] ar) {
        Set<Integer> l = new HashSet<>();
        for (int n : ar) {
            if (l.contains(n)) {
                l.remove(n);
            } else {
                l.add(n);
            }
        }
        return l;
    }

    public static void main(String[] args) {
        int[] a1 = new int[10000000];
        // only the first 10 slots get random values; the rest stay 0
        for (int i = 0; i < 10; i++) {
            a1[i] = (new Random()).nextInt(5);
        }
        long cur = System.nanoTime();
        System.out.println(oddNumList(a1));
        long c1 = System.nanoTime() - cur;
        System.out.println("TIME CONSUMED:" + c1);
        cur = System.nanoTime();
        System.out.println(oddNumHSet(a1));
        long c2 = System.nanoTime() - cur;
        System.out.println("TIME CONSUMED:" + c2);
        System.out.println("c1/c2*100: " + ((double) c1 / c2 * 100));
    }
}

And here is an output:

[1, 0]
TIME CONSUMED:101804000
[0, 1]
TIME CONSUMED:183261000
c1/c2*100: 55.55137208680516
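A caveat about these numbers: each method is timed on a single cold run, so the figures include JIT compilation and GC noise. A more careful (though still crude) approach is to warm up first and take the best of several runs; here is a minimal sketch, using a hypothetical `sumSquares` workload as a stand-in for the methods above:

```java
public class WarmupBench {
    // toy workload standing in for oddNumList / oddNumHSet (hypothetical)
    static long sumSquares(int n) {
        long s = 0;
        for (int i = 0; i < n; i++) s += (long) i * i;
        return s;
    }

    static long timeNanos(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // warm-up: let the JIT compile the hot path before measuring
        for (int i = 0; i < 10_000; i++) sumSquares(1_000);

        // measure several runs and keep the minimum, which is less noisy
        long best = Long.MAX_VALUE;
        for (int i = 0; i < 5; i++) {
            best = Math.min(best, timeNanos(() -> sumSquares(1_000_000)));
        }
        System.out.println("best run: " + best + " ns");
    }
}
```

For anything serious, a harness such as JMH handles warm-up, forking, and dead-code elimination properly.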

So, why is the implementation with ArrayList almost twice as fast as the one with HashSet? Thank you.

  • Obviously, it depends on your use case. Commented Oct 29, 2014 at 6:07
  • I don't think this is a duplicate? Commented Oct 29, 2014 at 6:10
  • Why is it a duplicate? I ask for a link, then. I searched; this question was not asked before. Commented Oct 29, 2014 at 6:11
  • @chrylis - I think this must be a duplicate of ArrayList vs HashSet instead of how to do Microbenchmarking? Commented Oct 29, 2014 at 6:15
  • @TheLostMind agreed. Commented Oct 29, 2014 at 6:18

1 Answer


ArrayList doesn't have any code that checks for duplicates, so add simply appends whatever you give it. A HashSet, on the other hand, is meant to hold only unique elements, so every insertion first checks whether the element is already present; that duplicate check is extra work an ArrayList never does.
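To make the difference concrete, here is a small sketch (the class name DupCheckDemo is mine) showing that ArrayList.add always appends, while HashSet.add checks for the element first and rejects duplicates:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DupCheckDemo {
    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        Set<Integer> set = new HashSet<>();

        // ArrayList.add never checks for duplicates: it always appends
        list.add(11);
        list.add(11);

        // HashSet.add hashes the element and checks the backing table first;
        // the second add finds the element already there and returns false
        boolean first  = set.add(11);   // true  - element inserted
        boolean second = set.add(11);   // false - duplicate rejected

        System.out.println(list.size());          // 2
        System.out.println(set.size());           // 1
        System.out.println(first + " " + second); // true false
    }
}
```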


8 Comments

So, if I add my own implementation of HashCollection_name via HashMap, it will be faster, right?
@GeorgeRevkov That doesn't make any sense - a HashMap by definition can't have duplicate keys, because that's the way they're looked up. A specific hash-code is made for each item - what does it mean if you have two items with the same hash code? How do you find the second item?
@GeorgeRevkov - Actually, a HashSet uses a HashMap behind the scenes.. So, it will not be faster.. ArrayList uses an Array behind the scenes, so it will be faster than both HashMap as well as HashSet.
I am talking about the way HashSet is implemented. It uses the value added to the HashSet as a key for a HashMap and then adds a default value to the map. If I understand correctly, the most time-consuming part is checking whether it has dups or not, which is implemented in HashSet (am I right here?). If I implement a HashSet without checking for dups, I will save time, won't I?
@GeorgeRevkov - It will still be slower than an ArrayList :P
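As the comments note, HashSet really is a thin wrapper over HashMap: each element becomes a key mapped to a shared dummy value. A minimal sketch of that delegation (MiniHashSet is a simplified illustration, not the real JDK class):

```java
import java.util.HashMap;
import java.util.Map;

// Simplified sketch of how java.util.HashSet delegates to HashMap;
// the real class implements the Set interface and much more.
public class MiniHashSet<E> {
    private static final Object PRESENT = new Object(); // shared dummy value
    private final Map<E, Object> map = new HashMap<>();

    // HashMap.put returns the previous value, or null if the key was absent,
    // so add() is true exactly when the element was not already present
    public boolean add(E e)      { return map.put(e, PRESENT) == null; }
    public boolean remove(E e)   { return map.remove(e) == PRESENT; }
    public boolean contains(E e) { return map.containsKey(e); }
    public int size()            { return map.size(); }

    public static void main(String[] args) {
        MiniHashSet<Integer> s = new MiniHashSet<>();
        System.out.println(s.add(3));  // true
        System.out.println(s.add(3));  // false - duplicate
        System.out.println(s.size());  // 1
    }
}
```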
