Use Objects.hash() or own hashCode() implementation?

Question

I have recently discovered the Objects.hash() method.

My first thought was that this tidies up a hashCode() implementation a lot. See the following example:

@Override
//traditional
public int hashCode() {
    int hash = 5;
    hash = 67 * hash + (int)(this.id ^ (this.id >>> 32));
    hash = 67 * hash + (int)(this.timestamp ^ (this.timestamp >>> 32));
    hash = 67 * hash + Objects.hashCode(this.severity);
    hash = 67 * hash + Objects.hashCode(this.thread);
    hash = 67 * hash + Objects.hashCode(this.classPath);
    hash = 67 * hash + Objects.hashCode(this.message);
    return hash;
}

@Override
//lazy
public int hashCode() {
    return Objects.hash(id, timestamp, severity, thread, classPath, message);
}

That said, this seems too good to be true, and I have never seen it used this way.

Are there any downsides to using Objects.hash() compared to implementing your own hash code? When would I choose each of those approaches?

Update

Although this topic is marked as resolved, feel free to keep posting answers that provide new information and concerns.

Also see HashCodeBuilder: commons.apache.org/proper/commons-lang/apidocs/org/apache/…
– NPE, Commented Aug 23, 2017 at 6:49
But the commons builder uses reflection. It is convenient but an absolute performance killer.
– GhostCat, Commented Aug 23, 2017 at 6:58
@NPE I really want to keep it with the natives. I'm not a big fan of the whole external apache common stuff.
– Herr Derb, Commented Aug 23, 2017 at 7:00
@MartinSchröder Thanks, but I'd like to keep my dependencies clean.
– Herr Derb, Commented Aug 30, 2017 at 5:49

Andy Turner · Accepted Answer · 2017-08-23 06:56:42Z

71

Note that the parameter of Objects.hash is Object.... This has two main consequences:

Primitive values used in the hash code calculation have to be boxed, e.g. this.id is converted from long to Long.
An Object[] has to be created to invoke the method.

The cost of creating these "unnecessary" objects may add up if hashCode is called frequently.
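
To make the two costs concrete, here is a minimal sketch of roughly what a call such as Objects.hash(id, severity) amounts to once the compiler has applied boxing and varargs packaging. The class name HashDesugarDemo and the constant values are made up for illustration; only Objects.hash, Long.valueOf and Arrays.hashCode are real JDK calls.

```java
import java.util.Arrays;
import java.util.Objects;

class HashDesugarDemo {
    // Illustrative values only; the names loosely mirror the question's fields.
    static final long ID = 42L;
    static final String SEVERITY = "WARN";

    // What you write.
    static int concise() {
        return Objects.hash(ID, SEVERITY);
    }

    // Roughly what that call amounts to: the long is boxed to a Long,
    // an Object[] is allocated, and the array's elements are hashed.
    static int spelledOut() {
        Object[] values = new Object[] { Long.valueOf(ID), SEVERITY };
        return Arrays.hashCode(values);
    }

    public static void main(String[] args) {
        System.out.println(concise() == spelledOut());
    }
}
```

Both methods return the same value; the second one just makes the temporary Long and Object[] visible.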

edited Aug 23, 2017 at 6:56

answered Aug 23, 2017 at 6:50

Andy Turner


2 Comments

candied_orange Over a year ago

As with all performance concerns, you should prove to yourself that you care before you care.

Andy Turner Over a year ago

@CandiedOrange yes; on the other hand, don't write code you might have to care about. I mean, you could implement hashCode with return 0; if you're not going to care about performance until you have proven otherwise. Since you only have to write it once, and likely your IDE is going to generate the code for you, you may as well opt for the verbose-but-performant version. But I wouldn't rush to change either form to the other in existing code.

nagendra547 · Accepted Answer · 2017-08-23 07:03:54Z

30

The following is the implementation of Objects.hash, which calls Arrays.hashCode internally:

public static int hash(Object... values) {
    return Arrays.hashCode(values);
}

This is the implementation of the Arrays.hashCode method:

public static int hashCode(Object a[]) {
    if (a == null)
        return 0;

    int result = 1;

    for (Object element : a)
        result = 31 * result + (element == null ? 0 : element.hashCode());

    return result;
}

So I agree with @Andy: the cost of creating these "unnecessary" objects may add up if hashCode is called frequently. If you implement it yourself, it would be faster.
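
As a sketch of what "implementing it yourself" could look like, the version below uses the same 31-based recurrence as Arrays.hashCode but calls Long.hashCode(long) on the primitives, so nothing is boxed and no array is allocated. The LogEntry class and its reduced field set are assumptions for illustration; hashViaObjectsHash exists only so the two results can be compared.

```java
import java.util.Objects;

class LogEntry {
    private final long id;
    private final long timestamp;
    private final String message; // may be null

    LogEntry(long id, long timestamp, String message) {
        this.id = id;
        this.timestamp = timestamp;
        this.message = message;
    }

    // Reference version: boxes both longs and allocates an Object[].
    int hashViaObjectsHash() {
        return Objects.hash(id, timestamp, message);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof LogEntry)) return false;
        LogEntry other = (LogEntry) o;
        return id == other.id
                && timestamp == other.timestamp
                && Objects.equals(message, other.message);
    }

    // Hand-written version: same recurrence as Arrays.hashCode,
    // but no boxing and no temporary array.
    @Override
    public int hashCode() {
        int result = 1;
        result = 31 * result + Long.hashCode(id);
        result = 31 * result + Long.hashCode(timestamp);
        result = 31 * result + Objects.hashCode(message);
        return result;
    }
}
```

Because Long.hashCode(long) is what a boxed Long's hashCode() returns, both methods produce the same value. Whether the difference in allocation matters in practice is something to measure.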

answered Aug 23, 2017 at 7:03

nagendra547


Tezra · Accepted Answer · 2017-08-25 13:15:09Z

15

I would like to try and make a strong argument for both.

Opening disclaimer

For this answer, Objects.hash(), Objects.hashCode(), and any function provided by any library that performs this role are interchangeable. First, I would like to argue: use Objects.hash() or don't use the static Objects functions at all.

Any argument for or against this method requires making assumptions about the compiled code that are not guaranteed to be true. For example, the optimizer may inline the function call, bypassing the extra call and the object allocation, just as loops that do nothing useful don't make it into the compiled version (unless you turn off the optimizer). You also have no guarantee that future Java versions won't include the JVM version, as C# does in its version of this method (for security reasons, I believe). So the only safe argument you can make about this function is that it is generally safer to leave the details of a proper hash to it than to try to implement your own naive version.

For Objects.hash

Guaranteed to be a good hash.
Takes 5 seconds to implement.
Yours would have had a bug in it somehow (especially if you copy-pasted the implementation).

Against Objects.hash

The Java docs make no promises about hash cross-compatibility (will a JVM v6 and a JVM v8 give the same values? Always? Across operating systems?).
Hash codes work best if they are "evenly distributed". So if an int value is only valid in the range 1 to 100, you might want to "redistribute" its hash codes so that they do not all land in the same bucket.
If you have any requirement that makes you question how Objects.hash works, reliability- or performance-wise, think carefully about whether the hash code is really what you want, and implement a custom hash-coding method that addresses your needs.
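
One possible reading of "redistributing" a small-range value is to run it through a bit-mixing step before returning it. The sketch below is only an illustration: the class SmallRangeKey, the method spread and the multiplier are assumptions, not something the JDK prescribes. Note that java.util.HashMap already applies its own spreading step, h ^ (h >>> 16), to every hash code, so whether extra mixing helps depends on the table implementation and should be measured.

```java
class SmallRangeKey {
    private final int level; // assumed to be valid only in the range 1..100

    SmallRangeKey(int level) {
        this.level = level;
    }

    // Hypothetical mixing step: multiplying by an odd constant scatters
    // consecutive inputs across the int range, and the xor-shift folds the
    // high bits into the low ones. Both steps are reversible, so distinct
    // inputs always give distinct outputs.
    static int spread(int value) {
        int h = value * 0x9E3779B9;
        return h ^ (h >>> 16);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof SmallRangeKey && ((SmallRangeKey) o).level == level;
    }

    @Override
    public int hashCode() {
        return spread(level);
    }
}
```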

edited Aug 25, 2017 at 13:15

answered Aug 23, 2017 at 22:02

Tezra


4 Comments

Andy Turner Over a year ago

You are referring to Objects.hashCode here: do you mean to do so, or do you mean Objects.hash (which is what OP is asking about)?

Tezra Over a year ago

@AndyTurner As far as I'm really concerned in this answer, any library provided hashing function is interchangeable and doesn't really affect the point. If you really care about the difference between how hash and hashCode work, you probably shouldn't be using either of them.

Andy Turner Over a year ago

I think you've missed the point of the question, which is asking about the advantages/disadvantages of Objects.hash to calculate a hash code from multiple fields, vs doing it "by hand". Objects.hashCode simply invokes Object.hashCode() on its parameter, handling the case of null. If you use this, you've still got to combine the hash codes somehow. The two are not interchangeable.

Tezra Over a year ago

@AndyTurner They are not interchangeable in the sense that one is "hash this" and the other is "hash these". I view them as interchangeable in the sense that you are asking an API to handle the hashing logic for you. In my answer, I try to say "If you aren't using Objects.hash, you probably shouldn't be using Objects.hashCode either." It really does only boil down to "Why is this available function insufficient, and what do I need to do to make a sufficient version?"
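
The "hash this" versus "hash these" distinction discussed in these comments can be checked directly. In the small sketch below, the class name HashVersusHashCode and its two helper methods are illustrative; the JDK calls are Objects.hashCode(Object) and Objects.hash(Object...).

```java
import java.util.Objects;

class HashVersusHashCode {
    // "Hash this": null-safe hashCode() of a single object.
    static int single(Object o) {
        return Objects.hashCode(o);
    }

    // "Hash these": combines the elements with the 31-based recurrence,
    // starting from 1, even when there is only one element.
    static int combined(Object... values) {
        return Objects.hash(values);
    }

    public static void main(String[] args) {
        System.out.println(single("a"));              // 97, i.e. "a".hashCode()
        System.out.println(combined("a"));            // 31 * 1 + 97 = 128
        System.out.println(single(null));             // 0
        System.out.println(combined((Object) null));  // 31 * 1 + 0 = 31
    }
}
```

So even for a single value the two methods return different numbers, which is why one cannot be swapped for the other without changing the resulting hash codes.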

Ihor M. · Accepted Answer · 2018-11-24 18:09:45Z

9

Joshua Bloch in his book Effective Java, 3rd edition, p. 53 discourages usage of Objects.hash(...) if performance is critical.

Primitives are autoboxed, and there is a penalty for creating an Object array.

answered Nov 24, 2018 at 18:09

Ihor M.


Luke Usherwood · Accepted Answer · 2021-11-17 20:17:34Z

2

Personally I'd side with the short code first, because it is so much quicker to read, change and verify as correct, which all serve to avoid bugs when modifying the class.

Then for performance-critical classes, or where fields are costly to hash, another available "tool" is to cache the result (like String does):

// volatile not required for 32-bit sized primitives
private int hash;

@Override
public final int hashCode() {
    // "Racy Single-check idiom" (Item 71, Effective Java 2nd ed.)
    int h = hash;
    if (h == 0) {
        h = Objects.hash(id, timestamp, severity, thread, classPath, message);
        hash = h;
    }
    return h;
}

In this lockless-threadsafe pattern (which assumes an immutable class, naturally) there's a slim chance that hash might get initialised by different threads multiple times, but this does not matter because the result of the public method is always identical. The key to (memory-visibility) correctness is ensuring that hash is never written and read more than once in the method.

The array-construction penalty of Objects.hash might be lower than you imagine once the code gets JIT-inlined & optimised by the C2 compiler.

(UPDATED) The good folks over at JDK had something baking that was pitched at eliminating the overheads of Objects::hash mentioned in other answers, but apparently the difference was not enough to matter for that particular method: JEP 348: Java Compiler Intrinsics for JDK APIs

edited Nov 17, 2021 at 20:17

answered Mar 1, 2019 at 9:46

Luke Usherwood


4 Comments

Farid Over a year ago

Nice implementation to avoid additional calculations but what if Object is mutable and one of the parameters used for hash calculations changes?

Luke Usherwood Over a year ago

Sure, then don't cache it, but then: also think twice before using any mutable object as a key in a hash-based map/set. Doing so can break the whole collection - objects might go "missing" inside it, etc.

Michel Charpentier Over a year ago

If h is 0, you'll end up not caching it. String uses an additional hashIsZero field to deal with that case (which makes the JMM issues even more subtle).

Luke Usherwood Over a year ago

Indeed, it is important to ensure h is not 0 for common cases. Simply adding a constant is usually sufficient. The hash function should then ensure the frequency of 0 occurring in non-corner cases is exceedingly low and thus averages out to roughly nothing. I think it'd be rare to ever need this significant extra complexity in most real-world cases. String does need to cater for very broad scenarios, but even there it got away without this mini-optimisation just fine up until 2019, despite its ubiquity.
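
For completeness, here is a minimal sketch of the extra-flag variant mentioned in the comments above, for the case where a computed hash of 0 should also be cached. It is loosely modeled on the approach described for String, not copied from JDK code, and it assumes an immutable class. The class CachedHashKey, its fields, and the computations counter (which exists only to show how often the hash is recomputed, and is not thread-safe) are all assumptions for illustration.

```java
import java.util.Objects;

final class CachedHashKey {
    private final int x;
    private final int y;

    private int hash;           // 0 means "not cached", unless hashIsZero is set
    private boolean hashIsZero; // set once the computed hash turned out to be 0
    int computations;           // illustration only: how often the hash was computed

    CachedHashKey(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof CachedHashKey)) return false;
        CachedHashKey other = (CachedHashKey) o;
        return x == other.x && y == other.y;
    }

    @Override
    public int hashCode() {
        int h = hash; // read the cached field once
        if (h == 0 && !hashIsZero) {
            computations++;
            h = Objects.hash(x, y);
            if (h == 0) {
                hashIsZero = true; // remember that 0 is the real result
            } else {
                hash = h;
            }
        }
        return h;
    }
}
```

With x = 0 and y = -961, Objects.hash(x, y) evaluates to 31 * (31 * 1 + 0) + (-961) = 0, so that instance exercises the flag: the hash is computed on the first call and not again on later calls.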
