5

I have a Boolean string (like "01100..001") of length 128 characters (means 128 number of 0/1). I am searching for an efficient (fast) hash function in Java, which produce a much lower representation than 128 bit and obviously with less collision. Can anybody help me, is there any such hash function ? Any suggestion ?

3
  • 3
    Less collision than the zero you'd get with a 128-bit representation? Commented Apr 22, 2012 at 17:15
  • @eggyal, Thanks a lot. Nice concept. It will help me lot. :) Commented Apr 22, 2012 at 17:27
  • Using a String for just storing a 128-bit value seems to me a bit of overkill, a waste of memory and - especially if you care on performance - definitely not the best option. Commented Apr 22, 2012 at 17:41

3 Answers 3

8

Have you considered using a java.util.BitSet instead, depending on what you are doing it could be a lot easier and more efficient? http://docs.oracle.com/javase/6/docs/api/java/util/BitSet.html It has a .hashCode() method as well.

Sign up to request clarification or add additional context in comments.

2 Comments

Thanks a lot. It will be nice one to try. However, is there any document stating that probability of collision ?
Not that I'm aware of. I know it was improved back in 2004 (see bug parade: bugs.sun.com/bugdatabase/view_bug.do?bug_id=4979028) and the java doc show (defines?) how the hashcode is calculated: The source is available as well of course. docs.oracle.com/javase/6/docs/api/java/util/…
5

Try using the .hashCode() method on Java String class, it returns an int and it is very fast.

Or you can use the .hashCode() method on java.util.BitSet as Pulsar suggest, if you prefer store your data in BitSet.

5 Comments

I was going to say that except that I would've converted the String to BigInteger first and called the .hashCode() method for that. But I'm guessing it's faster to just hash the original String like you suggested. Just wondering why you'd want to have 16 bytes stored as a 128 byte String, that seems like a huge waste of space.
Thanks a lot. It will be nice one to try. However, is there any document stating the probability of collision ?
@ZeroOne, I am also thinking to convert it to BigInt and then call hashcode. Because, I think that will create less collisions.
Try to read this answer about collisions: stackoverflow.com/questions/299304/…
Using the hashCode() of BitSet as per @Pulsar's answer is probably better.
1

If you need to calculate the hash of a string, simply use the hashCode() method of the String class. Depending on the implementation, several optimizations are made for quickly computing this value.

As an example, in OpenJDK's implementation of the String class the hashCode() method caches the value in the hash attribute and only needs to be computed once.

And who said that a string of 128 characters has a hash of 128-bits? all hashes returned by the hashCode() method in Java are of type int, and ints in Java are represented using 32-bits.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.