0

I know now a days inbuilt utilities available like HashCodeBuilder from Apache commons lang but i was trying to understand how to implement it myself and came across the example of hascode function for Employee class at http://en.wikipedia.org/wiki/Java_hashCode()

Everywhere on google, same kind of technique is suggested like multiplying non zero value with odd prime number and then summing it with instance variable(do it for instance variable).

Questions:-

1)why we can't return employeeId as hascode becoz it will aways be unique. Its simple and serves the hascode purpose. Agreed if it is not unique probably we need that kind of technique. Is that right?

2)Even If employee id is not unique, why its suggested to multiply with odd prime number? why taking any damn integer is not considered good?

Update:-

Peter i ran the example you mentioned it printed

[0, 32, 64, 96, 128, 160, 192, 224, 288, 256, 352, 320, 384]

[0, 32, 64, 96, 128, 160, 192, 224, 288, 256, 352, 320, 384]

i assume that output for now as yoy expected to understand the concept as you mentioned in your answer

[373, 343, 305, 275, 239, 205, 171, 137, 102, 68, 34, 0]

[0, 34, 68, 102, 137, 171, 205, 239, 275, 305, 343, 373]

As you suggested in your comment that this example demonstrated even unique hashcode can end up in same bucket. How this example demonstrated this behaviour? Do you mean 373 for integers and 0 for integers2 end up in same bucket ?

How prime number is helping in this example and how 34 would not have helped?

1 Answer 1

2

why we can't return employeeId as hascode becoz it will aways be unique. Its simple and serves the hascode purpose. Agreed if it is not unique probably we need that kind of technique. Is that right?

It's uniqueness is not important. Multiplying by a prime is a good way of merging multiple fields into one hashCode, but it sounds like you only have one, so it wont make much difference.

Even If employee id is not unique, why its suggested to multiply with odd prime number? why taking any damn integer is not considered good?

If you multiply by an even number what will the lowest bit of the hashCode be? How random/useful is it?


Note: every hashCode() for an Integer is unique, but get the right combination of integer values and when they are reduced to the capacity of a HashMap, they actually map to the same bucket. In this example, the entries appear in the reverse order they were added because every entry mapped to the same bucket.

HashSet<Integer> integers = new HashSet<>();
for (int i = 0; i <= 400; i++)
    if ((hash(i) & 0x1f) == 0)
        integers.add(i);
HashSet<Integer> integers2 = new HashSet<>();
for (int i = 400; i >= 0; i--)
    if ((hash(i) & 0x1f) == 0)
        integers2.add(i);
System.out.println(integers);
System.out.println(integers2);


static int hash(int h) {
    // This function ensures that hashCodes that differ only by
    // constant multiples at each bit position have a bounded
    // number of collisions (approximately 8 at default load factor).
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}

prints

[373, 343, 305, 275, 239, 205, 171, 137, 102, 68, 34, 0]
[0, 34, 68, 102, 137, 171, 205, 239, 275, 305, 343, 373]
Sign up to request clarification or add additional context in comments.

13 Comments

"If you multiply by an even number what will the lowest bit of the hashCode be?" How random/useful is it? Can you elaborate on it. Thanks in advance :)
Even if we have multiple fields, Why multiplying by a prime is a good way of merging multiple fields into one hashCode? Why not only empId? i think answer lies in its uniqueness is not important. I didn't get what does it mean? As per undersranding uniqueness is important so that its get stored under separate bucket for unequal objects for hashbased data structures?
If you multiply by an even number, you get an even number so the lowest bit is always 0 and of no value.
You want a randomised hashCode and a prime number is the least likely to get some pattern which means not all values are equal. Image you use 9 and you use a Hashtable which is a multiple of 3 in size. This will result in 2 out of every 3 buckets not being used, or worse possible only one in 9 buckets being used. The smaller the prime factors, the more likely you will have some poorly performing situation
Multiplying one value by a prime which not make it any more random.
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.