13

I've implemented a custom IEqualityComparer for EventLogEntry.

public class EventLogEntryListComparison :
    IEqualityComparer<List<EventLogEntry>>,
    IEqualityComparer<EventLogEntry>

For IEqualityComparer<List<EventLogEntry>>, the GetHashCode function is very simple.

public int GetHashCode(List<EventLogEntry> obj)
{
    return obj.Sum(entry => 23 * GetHashCode(entry));
}

However, this throws an OverflowException for certain entries.

"Arithmetic operation resulted in an overflow."
   at System.Linq.Enumerable.Sum(IEnumerable`1 source)
   at System.Linq.Enumerable.Sum[TSource](IEnumerable`1 source, Func`2 selector)
   at <snip>.Diagnostics.EventLogAnalysis.EventLogEntryListComparison.GetHashCode(List`1 obj) in C:\dev\<snip>Diagnostics.EventLogAnalysis\EventLogEntryListComparison.cs:line 112
   at System.Collections.Generic.Dictionary`2.Insert(TKey key, TValue value, Boolean add)
   at System.Collections.Generic.Dictionary`2.set_Item(TKey key, TValue value)
   at <snip>.Diagnostics.EventLogAnalysis.Program.AnalyseMachine(String validMachineName) in C:\dev\<snip>.Diagnostics.EventLogAnalysis\Program.cs:line 104
   at System.Threading.Tasks.Parallel.<>c__DisplayClass2d`2.<ForEachWorker>b__23(Int32 i)
   at System.Threading.Tasks.Parallel.<>c__DisplayClassf`1.<ForWorker>b__c()

After trying to get the same error whilst debugging and couldn't in the immediate window, I changed the code to this and bye bye OverflowException?

int total = 0;
foreach (var eventLogEntry in obj)
{
    total += GetHashCode(eventLogEntry);
}

return total;

How is it that LINQ's Sum function behaving differently?

Edit 2

Thanks to a few comments, the corrected and intended GetHashCode function is now as follows,

public int GetHashCode(List<EventLogEntry> obj)
{
    return unchecked(obj.Aggregate(17,
        (accumulate, entry) =>
        accumulate * 23 + GetHashCode(entry)));
}
4
  • 3
    As a side note, there is little benefit in multiplying each element with a prime and then simply adding them. You should multiple the entire hash (calculated so far) on each iteration and then add the next item's hash (I presume you have chosen 23 as your prime based on this answer, or a similar one). Commented Jun 14, 2012 at 13:21
  • 1
    Note that your edited version is (essentially) multiplying the sub-hashcodes together rather than multiplying by a prime and adding them. Maybe you want (accumulate*23) + GetHashCode(entry)? Commented Jun 14, 2012 at 13:46
  • possible duplicate of Enumerable.Sum() overflowing Commented Jun 28, 2015 at 11:26
  • Help me out here - I'd like to learn. Could it not be that the sum of all of the 23 * GetHashCode(entry) (of type int/Int32 ?) in the OP is larger than Int32.MaxValue (the type defined for GetHashCode)? Thus resulting in the overflow? In my case I still got the error when after changing the return type to Int64... I also had to cast the property in the lambda to Int64 (or long). For example: return obj.Sum(entry => 23 * (Int64)GetHashCode(entry)); Commented Jun 3, 2022 at 11:30

3 Answers 3

11

LINQ's Enumerable.Sum(...) methods perform the additions inside a checked block. This means that they deliberately throw an exception if the sum overflows.

Your sum is not inside a checked block, so whether or not it throws an exception depends on... whether it is called from inside a checked block, or a property on the assembly I believe.

Sign up to request clarification or add additional context in comments.

2 Comments

That's right. As a side note: You could use Aggregate instead of Sum: x.Aggregate((a, b) => a + b); (but probably you don't want).
@sloth I would just say that that Aggregate call will throw an exception if the list is empty, but if you provide a seed value of 0 it will work just fine: x.Aggregate(0, (sum, i) => sum + i)
5

It's because of the different behavior of Assemblies compiled in C# and the implementation of Enumerable.Sum.

If you compile an Assembly in C#, by default all additions are performed in unchecked mode, which is why you don't get an overflow in your last example. If you want the runtime to throw on overflows, you need to use checked blocks (of course for your hash, you don't want that, so C#'s default behavior is fine).

By contrast, Enumerable.Sum is meant to calculate a sum, and usually, you don't want sums to overflow. That's why Enumerable.Sum performs its calculations in checked mode, which throws an exception if the sum overflows.

Comments

1

if you're computing a Hash Code, you may not want to use Sum anyways. Using Xor (^) will provide the same results and may even spread your hash codes out more than a sum will. Try this method:

public int GetHashCode(List<EventLogEntry> obj)
{
    int total = 0;
    foreach (var eventLogEntry in obj)
    {
        total ^= GetHashCode(eventLogEntry);
    }

    return total;
}

3 Comments

Maybe not be relevant for this specific case, but this will lead to e.g. { entryA, entryB } and { entryB, entryA } having the same hash code.
@dstanley I strongly disagree with that, read more here, stackoverflow.com/questions/263400/…
... Yup, that's a good point. A proper, multiply-by-a-prime-every-time sum would avoid this, but you can't do that with a LINQ Sum.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.