GetHashCode() on byte[] array

Question

What does GetHashCode() calculate when invoked on a byte[] array? Two arrays with equal content do not produce the same hash.

FYI: If you are on .NET 4, ((IStructuralEquatable) myArray).GetHashCode(EqualityComparer<object>.Default) should give the same result for two arrays with the same content. msdn.microsoft.com/en-us/library/…
– Just another metaprogrammer, Commented Aug 30, 2011 at 15:25
Related: why-does-.NET-not-implement-gethashcode-for-collections
– nawfal, Commented Aug 9, 2014 at 11:32
@FuleSnabel Note that IStructuralEquatable requires use of the non-generic IEqualityComparer interface, which means that each byte in the array is going to be boxed while the hash code is computed.
– cdhowie, Commented Nov 13, 2015 at 17:32

Jon Skeet · Accepted Answer · 2011-08-30 20:11:31Z

80

Arrays in .NET don't override Equals or GetHashCode, so the value you'll get is basically based on reference equality (i.e. the default implementation in Object) - for value equality you'll need to roll your own code (or find some from a third party). You may want to implement IEqualityComparer<byte[]> if you're trying to use byte arrays as keys in a dictionary etc.

EDIT: Here's a reusable array equality comparer which should be fine so long as the array element handles equality appropriately. Note that you mustn't mutate the array after using it as a key in a dictionary, otherwise you won't be able to find it again - even with the same reference.

using System;
using System.Collections.Generic;

public sealed class ArrayEqualityComparer<T> : IEqualityComparer<T[]>
{
    // You could make this a per-instance field with a constructor parameter
    private static readonly EqualityComparer<T> elementComparer
        = EqualityComparer<T>.Default;

    public bool Equals(T[] first, T[] second)
    {
        if (first == second)
        {
            return true;
        }
        if (first == null || second == null)
        {
            return false;
        }
        if (first.Length != second.Length)
        {
            return false;
        }
        for (int i = 0; i < first.Length; i++)
        {
            if (!elementComparer.Equals(first[i], second[i]))
            {
                return false;
            }
        }
        return true;
    }

    public int GetHashCode(T[] array)
    {
        unchecked
        {
            if (array == null)
            {
                return 0;
            }
            int hash = 17;
            foreach (T element in array)
            {
                hash = hash * 31 + elementComparer.GetHashCode(element);
            }
            return hash;
        }
    }
}

class Test
{
    static void Main()
    {
        byte[] x = { 1, 2, 3 };
        byte[] y = { 1, 2, 3 };
        byte[] z = { 4, 5, 6 };

        var comparer = new ArrayEqualityComparer<byte>();

        Console.WriteLine(comparer.GetHashCode(x));
        Console.WriteLine(comparer.GetHashCode(y));
        Console.WriteLine(comparer.GetHashCode(z));
        Console.WriteLine(comparer.Equals(x, y));
        Console.WriteLine(comparer.Equals(x, z));
    }
}
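The warning about mutating a key can be seen in a small sketch. ByteArrayComparer and MutatedKeyDemo are hypothetical names; the comparer is a condensed, byte-only stand-in for the generic one above (same 17/31 hash), included only so the example compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Condensed byte-only stand-in for the generic comparer (hypothetical name).
public sealed class ByteArrayComparer : IEqualityComparer<byte[]>
{
    public bool Equals(byte[] first, byte[] second)
    {
        if (ReferenceEquals(first, second)) return true;
        if (first == null || second == null) return false;
        return first.SequenceEqual(second);
    }

    public int GetHashCode(byte[] array)
    {
        if (array == null) return 0;
        unchecked
        {
            int hash = 17;
            foreach (byte b in array) hash = hash * 31 + b;
            return hash;
        }
    }
}

public static class MutatedKeyDemo
{
    public static void Main()
    {
        var map = new Dictionary<byte[], string>(new ByteArrayComparer());
        byte[] key = { 1, 2, 3 };
        map[key] = "payload";

        // A different instance with the same content finds the entry.
        Console.WriteLine(map.ContainsKey(new byte[] { 1, 2, 3 })); // True

        // Mutating the key changes its hash; the dictionary still remembers the old one.
        key[0] = 42;
        Console.WriteLine(map.ContainsKey(key));                    // False: hash no longer matches
        Console.WriteLine(map.ContainsKey(new byte[] { 1, 2, 3 })); // False: hash matches, contents don't
    }
}
```

If keys might be modified later, copying the array before inserting it is one way to sidestep the problem.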

edited Aug 30, 2011 at 20:11

answered Aug 30, 2011 at 14:22

Jon Skeet


6 Comments

Jon Skeet Over a year ago

@Chesnokov Yuriy: Okay, I've edited some code into my answer.

Douglas Over a year ago

There seems to be some debate on whether GetHashCode should scan over the entire sequence. Interestingly, the internal implementation for Array.IStructuralEquatable.GetHashCode only considers the last eight items of an array, sacrificing hash uniqueness for speed.

Peter - Reinstate Monica Over a year ago

I did something similar using Enumerable.SequenceEqual(). Is there a particular reason to hand-code the element comparison? (Admittedly it's probably a bit faster.)

Jon Skeet Over a year ago

@PeterA.Schneider: I don't think SequenceEqual is optimized to compare lengths first if the source implements appropriate interfaces.

bitbonk Over a year ago

@JonSkeet Since we have new primitives like Memory<T>, Span<T> or Sequence<T> can this code be optimised in any way? For example we do have SequenceEqual for ReadOnlySpan<T> now.
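As a rough sketch of what the span-based route could look like for the equality half: this assumes .NET Core 2.1 or later (or the System.Memory package), and SpanEquality / BytesEqual are made-up names. It uses the MemoryExtensions.SequenceEqual extension on spans.

```csharp
using System;

public static class SpanEquality
{
    // Content equality for byte arrays via the span-based SequenceEqual.
    public static bool BytesEqual(byte[] first, byte[] second)
    {
        if (ReferenceEquals(first, second)) return true;
        // AsSpan() turns a null array into an empty span, so without this
        // check null and an empty array would compare as equal.
        if (first == null || second == null) return false;
        return first.AsSpan().SequenceEqual(second);
    }

    public static void Main()
    {
        Console.WriteLine(BytesEqual(new byte[] { 1, 2, 3 }, new byte[] { 1, 2, 3 })); // True
        Console.WriteLine(BytesEqual(new byte[] { 1, 2, 3 }, new byte[] { 1, 2 }));    // False
        Console.WriteLine(BytesEqual(null, new byte[0]));                              // False
    }
}
```

This only replaces the hand-written loop in Equals; the hash still has to be computed separately.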


Community · Accepted Answer · 2017-05-23 12:10:04Z

23

Like other non-primitive built-in types, it just returns something arbitrary. It definitely doesn't try to hash the contents of the array. See this answer.

edited May 23, 2017 at 12:10


answered Aug 30, 2011 at 14:22

mqp


Petter Hesselberg · Accepted Answer · 2023-12-15 11:06:20Z

16

Simple solution

using System.Numerics;

// Throws ArgumentNullException if bytes is null.
public static int GetHashFromBytes(byte[] bytes)
{
    return new BigInteger(bytes).GetHashCode();
}
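One property of the BigInteger approach worth knowing: the constructor reads the bytes as a little-endian two's complement number, so arrays that differ only in trailing sign-extension bytes map to the same value and therefore the same hash. A small sketch (BigIntegerHashDemo is a made-up wrapper name):

```csharp
using System;
using System.Numerics;

public static class BigIntegerHashDemo
{
    public static int GetHashFromBytes(byte[] bytes) =>
        new BigInteger(bytes).GetHashCode();

    public static void Main()
    {
        // { 1 } and { 1, 0 } both encode the value 1.
        Console.WriteLine(GetHashFromBytes(new byte[] { 1 }) ==
                          GetHashFromBytes(new byte[] { 1, 0 }));       // True

        // { 0xFF } and { 0xFF, 0xFF } both encode -1.
        Console.WriteLine(GetHashFromBytes(new byte[] { 0xFF }) ==
                          GetHashFromBytes(new byte[] { 0xFF, 0xFF })); // True
    }
}
```

That is harmless for a hash code, since collisions are allowed, but it means BigInteger equality is not a substitute for comparing the arrays themselves.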

edited Dec 15, 2023 at 11:06

Petter Hesselberg


answered Nov 15, 2018 at 9:58

Daniil Sokolyuk


6 Comments

Guy Langston Over a year ago

Seeing this solution made me smile. Clean, elegant. Digging deeper the hash implementation ends up calling github.com/microsoft/referencesource/blob/master/…

Dave Jellison Over a year ago

@XeorgeXeorge so?

fjch1997 Over a year ago

@DaveJellison There is a 1 in 2^32 chance of collision, which is negligible for most scenarios but is something that must be kept in mind whenever there's a hash code.

Dave Jellison Over a year ago

Agreed, but this is inherent in hashing as a rule. It's like going to dictionary.com to complain about the definition of a word.

Steve Pick Over a year ago

Note that this method incurs a copy of the whole byte array, so it may not be efficient. Also, it's important to understand the purpose of GetHashCode(): it's not intended to produce a unique value but rather a well-distributed value for allocating buckets in a Dictionary or HashSet, which benefit from the buckets being roughly equal in size. Both types use a combination of GetHashCode() and Equals() to determine whether a collision has really occurred.
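The last point can be demonstrated with a deliberately bad comparer whose hash is always 0. ConstantHashComparer and CollisionDemo are hypothetical names for this sketch; every lookup collides, yet the set still behaves correctly because Equals has the final say.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Deliberately poor comparer: every array lands in the same bucket.
public sealed class ConstantHashComparer : IEqualityComparer<byte[]>
{
    public bool Equals(byte[] first, byte[] second) =>
        ReferenceEquals(first, second) ||
        (first != null && second != null && first.SequenceEqual(second));

    public int GetHashCode(byte[] array) => 0;
}

public static class CollisionDemo
{
    public static void Main()
    {
        var set = new HashSet<byte[]>(new ConstantHashComparer());
        set.Add(new byte[] { 1, 2, 3 });
        set.Add(new byte[] { 4, 5, 6 });
        set.Add(new byte[] { 1, 2, 3 }); // rejected: Equals finds the existing entry

        Console.WriteLine(set.Count);                            // 2
        Console.WriteLine(set.Contains(new byte[] { 4, 5, 6 })); // True
    }
}
```

Correctness survives, but lookups degrade to a linear scan of one bucket, which is why a well-distributed hash still matters.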


rickythefox · Accepted Answer · 2011-08-30 14:23:41Z

13

byte[] inherits GetHashCode() from object; it doesn't override it. So what you get is basically object's implementation.

answered Aug 30, 2011 at 14:23

rickythefox


Kay Zed · Accepted Answer · 2024-08-05 14:16:36Z

If you are using .NET 6 or at least .NET Core 2.1, you can write less code and achieve better performance with the System.HashCode struct.

Using the method HashCode.AddBytes() which is available from .NET 6:

public int GetHashCode(byte[] value)
{
    var hash = new HashCode();
    hash.AddBytes(value);
    return hash.ToHashCode();
}

Using the method HashCode.Add which is available from .NET Core 2.1:

public int GetHashCode(byte[] value) =>
    value.Aggregate(new HashCode(), (hash, i) => {
        hash.Add(i);
        return hash;
    }).ToHashCode();

Note that in the documentation of HashCode.AddBytes() it says:

This method does not guarantee that the result of adding a span of bytes will match the result of adding the same bytes individually.

In this sharplab demo both output the same result, but this might vary with .NET version or runtime environment.
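For completeness, a sketch of how the AddBytes variant might be wrapped in an IEqualityComparer<byte[]> so it can back a dictionary. It assumes .NET 6 or later; HashCodeByteArrayComparer and HashCodeComparerDemo are made-up names, and the span-based Equals is my own choice, not part of the answer above.

```csharp
using System;
using System.Collections.Generic;

public sealed class HashCodeByteArrayComparer : IEqualityComparer<byte[]>
{
    public bool Equals(byte[] first, byte[] second)
    {
        if (ReferenceEquals(first, second)) return true;
        if (first == null || second == null) return false;
        return first.AsSpan().SequenceEqual(second);
    }

    public int GetHashCode(byte[] value)
    {
        if (value == null) return 0;
        var hash = new HashCode();
        hash.AddBytes(value); // available from .NET 6
        return hash.ToHashCode();
    }
}

public static class HashCodeComparerDemo
{
    public static void Main()
    {
        var cache = new Dictionary<byte[], string>(new HashCodeByteArrayComparer());
        cache[new byte[] { 1, 2, 3 }] = "first";

        // Lookup by content with a different array instance.
        Console.WriteLine(cache.TryGetValue(new byte[] { 1, 2, 3 }, out var hit) ? hit : "miss"); // first
        Console.WriteLine(cache.ContainsKey(new byte[] { 3, 2, 1 }));                             // False
    }
}
```

Keep in mind that System.HashCode is seeded randomly per process, so these values are only meaningful in memory; they should not be persisted or compared across runs.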

jishi · Accepted Answer · 2011-08-30 14:23:44Z

0

If it's not the same instance, it will return different hashes. I'm guessing it is based on the memory address where it is stored somehow.

answered Aug 30, 2011 at 14:23

jishi


1 Comment

Chesnokov Yuriy Over a year ago

no, it is not the same instance, I presume in that case hashes would be equal
