1

I made this function and it works the same as Original Java function when you input something short, but if i input something larger than 5-7 characters - then I get some realy big number. (And not the right hashcode)

Here is the formula of Java's hash function:

s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]

Simplier one (Only works for short strings):

s = "abc" //String
n = 3 //Lenght of the String
s[0] = 'a'. ASCII code of 'a' = 97.
97 * (31 ^ (n - 1))
97 * (31 ^ (2))
97 * 961 = 93217

s[1] = 'b'. ASCII code of 'b' = 98.
98 * (31 ^ (n - 2))
98 * (31 ^ 1)
98 * 31 = 3038

s[2] = 'c'. ASCII code of 'c' = 99.
99 * (31 ^ (n - 3))
99 * (31 ^ 0)
99 * 1 = 99

93217 + 3038 + 99 = 96354 //

I want to know how does Java makes hash small even when I enter a huge string.

Java's hashcode of "Hello" - 69609650
My hashcode of "Hello" - 69609650


Java's hashcode of "Welcome to Tutorialspoint.com" - 1186874997
My hashcode of "Welcome to Tutorialspoint.com" - 5.17809991536626e+43

Also how can hash be negative if we add up numbers ?

0

3 Answers 3

3

I suspect your implementation (which you haven't shown) uses BigInteger or something similar. Java just uses int - so when it overflows the range of positive 31-bit integers, it goes into large negative integers, and then as you add more (positive) values, you'll end up with small negative integers, then small positive integers, then large positive integers - and back to large negative integers.

Sign up to request clarification or add additional context in comments.

Comments

2

String's hashCode involves only int addition and multiplication, so it results in an int, which may overflow (hence the negative values).

public int hashCode() {
    int h = hash;
    int len = count;
    if (h == 0 && len > 0) {
        int off = offset;
        char val[] = value;
        for (int i = 0; i < len; i++) {
            h = 31*h + val[off++];
        }
        hash = h;
    }
    return h;
}

Based on your 5.17809991536626e+43 value, it looks like you are doing floating point calculations (perhaps you are using Math.pow() which returns a double), which give different results for large numbers.

Comments

1

Source code for String$hashCode():

 1494       public int hashCode() {
 1495           int h = hash;
 1496           if (h == 0 && count > 0) {
 1497               int off = offset;
 1498               char val[] = value;
 1499               int len = count;
 1500   
 1501               for (int i = 0; i < len; i++) {
 1502                   h = 31*h + val[off++];
 1503               }
 1504               hash = h;
 1505           }
 1506           return h;
 1507       }

int is a signed integer on 4 bytes and it will just overflow during the hash computation, yielding a value that can be negative but is always bound by int.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.