
I have some code that is failing because two string literals have hash codes that evaluate to the same value. I appreciate that collisions can happen, but I wasn't quite expecting this one. While debugging the issue, a colleague and I found that if we evaluated this in the Immediate window:

"55d02ProductAd".GetHashCode() == "55b0tProductAd".GetHashCode()

it would evaluate to true. Not ideal, but not impossible. When we described this to another colleague, in his disbelief he wrote up a scratch program that did

var h1 = "55d02ProductAd".GetHashCode();
var h2 = "55b0tProductAd".GetHashCode();
Console.WriteLine(h1 == h2);

In his program, the two hash codes are not equal. We have our monitors next to each other and are confused by the different outputs. Any thoughts?

  • msdn.microsoft.com/en-us/library/… You should never persist or use a hash code outside the application domain in which it was created, because the same object may hash differently across application domains, processes, and platforms. Commented Sep 2, 2016 at 18:44
  • They do have the same hash code, at least in .NET 4.5: dotnetfiddle.net/O3VUtX Commented Sep 2, 2016 at 18:48
  • 1. Do both on the same machine; different machines may produce different hashes. 2. They both equate to true on my machine. 3. Don't ever use hash codes for equality. Hash codes are not unique. Commented Sep 2, 2016 at 18:48
  • @rick Even on the same machine it's allowed to produce different hashes. Commented Sep 2, 2016 at 18:50
  • Related: referencesource.microsoft.com/#mscorlib/system/string.cs,838 Commented Sep 2, 2016 at 18:53

1 Answer


Hash codes are only contractually obligated to be the same for a given value within the context of a single application's execution. Since you're comparing the values of GetHashCode from entirely different applications, there is no obligation for them to be equal.
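If a hash does have to stay the same across processes or machines, one option is to compute it with a fixed algorithm instead of relying on GetHashCode. The following is only a minimal sketch, assuming 32-bit FNV-1a over the string's UTF-8 bytes is acceptable for your purposes; StableHash and Fnv1a32 are hypothetical names for illustration, not part of .NET.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration; not part of the .NET libraries.
static class StableHash
{
    // 32-bit FNV-1a over the UTF-8 bytes of the text.
    // The algorithm and its constants are fixed, so the result does not
    // depend on the runtime version, bitness, or process.
    public static uint Fnv1a32(string text)
    {
        const uint offsetBasis = 2166136261;
        const uint prime = 16777619;

        uint hash = offsetBasis;
        foreach (byte b in Encoding.UTF8.GetBytes(text))
        {
            unchecked
            {
                hash ^= b;      // mix in the next byte
                hash *= prime;  // multiply, wrapping on overflow
            }
        }
        return hash;
    }
}

class Program
{
    static void Main()
    {
        // The same input yields the same value on every run.
        Console.WriteLine(StableHash.Fnv1a32("55d02ProductAd"));
        Console.WriteLine(StableHash.Fnv1a32("55b0tProductAd"));
    }
}
```

Even with a stable hash, two different strings can still share a value, so equal hashes should always be confirmed with an actual string comparison.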


5 Comments

He is not saying that he is comparing values from different applications. He says that one application produces different values.
@Stilgar No, he's not. He's saying he had an application, computed these hashes, then a coworker created a new application, hashed the same strings, and got different values. That's an entirely different application. And of course, even if it was the same application, if it was a different invocation of it, the hash would be allowed to vary. If you just print the hash of "Hello World!" and run it over and over it's allowed to give a different value.
Also, the birthday problem makes collisions reasonably common. Many structures that rely on hash codes (like Dictionary<TK, TV>) handle collisions elegantly.
@Haney Well, any that don't are buggy. Every sensible implementation needs to handle collisions.
@Haney Sensible handling means using hashes to establish inequality, not equality: different hashes prove two values differ, while equal hashes prove nothing. It's really an easy way to exclude things from matching algorithms, within the same application domain.
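The point made in these comments about collision handling can be sketched with a deliberately bad comparer. This is only an illustration: AlwaysCollidingComparer and CollisionDemo are hypothetical names, and the constant hash code is an assumption chosen to force every key to collide.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer for illustration: every key gets the same hash code,
// so all keys collide, while Equals still compares the actual characters.
sealed class AlwaysCollidingComparer : IEqualityComparer<string>
{
    public bool Equals(string x, string y) =>
        string.Equals(x, y, StringComparison.Ordinal);

    public int GetHashCode(string obj) => 42;
}

static class CollisionDemo
{
    public static Dictionary<string, int> Build()
    {
        var map = new Dictionary<string, int>(new AlwaysCollidingComparer());
        map["55d02ProductAd"] = 1;
        map["55b0tProductAd"] = 2;
        return map;
    }

    static void Main()
    {
        var map = Build();
        Console.WriteLine(map.Count);               // 2
        Console.WriteLine(map["55d02ProductAd"]);   // 1
        Console.WriteLine(map["55b0tProductAd"]);   // 2
    }
}
```

Both keys stay distinct because the dictionary uses the hash only to narrow the search and falls back on Equals to decide whether two keys match. The collisions cost lookup speed, not correctness.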
