0

I have a simple struct consists of a fixed size string and an integer. I need to use this struct as the key for a hash table. I have a hash function for sting, Hs(string), and a hash function for integer, Hi(int), I'm wondering if the hash function for this simple struct would just be H(struct) = Hs(string) + Hi(int)? Alternatively, I could encode the integer into a string and append it to the string, then just use the string hash function. Any suggestions? Thanks.

3
  • 1
    What programming language? Commented Sep 26, 2013 at 3:01
  • c,c++. But I would be surprised if the answer is language specific. Commented Sep 26, 2013 at 3:13
  • some programing languages provide runtime support for hashing of classes/structs derived from built-ins (e.g: C#) Commented Sep 26, 2013 at 3:15

2 Answers 2

1

In order to figure the "how" we need to answer a few questions first:
a) how many items you should be able to accommodate ?
b) what's the size of the hash table ?
c) how many combinations of <string X int> are there (since the string has a fixed size it's easy to calculate) ?

Once you figure that out - it will be easier to find a hashing function that will minimize collisions, for example, there might be a case where using Hs(string) is good enough!

Sign up to request clarification or add additional context in comments.

4 Comments

the string portion is something like uuid, evenly distributed, the integer is something like the elapsed seconds since an early starting time. I would say the number of entries in the table could reach a few millions and more.
According to what I read, a loading factor of 75% is considered good. If the UUID provides a good distribution then I don't see any advantage of using the int in the hashing algorithm.
There are entries with the same uuid but different integer values.
@atsu right, and you can use it to identify in which bucket the <string X int> is (when there is a collision).
0

Either of those would work. My default would be Hs(string) XOR Hs(int) but plus is fine too. It just needs to probably not collide and both of those will probably not collide although Hs(string) XOR Hs(int) or Hs(string) + Hs(int) will be faster.

3 Comments

int = 32 byte (at least in Java it is) while String is (theoretically) limited only by the JVM - how do you XOR two items of different sizes ?
You hash them both first. You now have something fixed size of the correct length.
@FrankieRobertson without knowing some details of the platform/hash implementations, you might introduce a lot of bias (collisions) by simply XORing the two values.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.