Why and how are Python functions hashable?

Question

I recently tried the following commands in Python:

>>> {lambda x: 1: 'a'}
{<function __main__.<lambda>>: 'a'}

>>> def p(x): return 1
>>> {p: 'a'}
{<function __main__.p>: 'a'}

The success of both dict creations indicates that both lambda and regular functions are hashable. (Something like {[]: 'a'} fails with TypeError: unhashable type: 'list').

The hash is apparently not necessarily the ID of the function:

>>> m = lambda x: 1
>>> id(m)
140643045241584
>>> hash(m)
8790190327599
>>> m.__hash__()
8790190327599

The last command shows that the __hash__ method is explicitly defined for lambdas, i.e., this is not some automagical thing Python computes based on the type.

What is the motivation behind making functions hashable? For a bonus, what is the hash of a function?

I really feel like this is the sort of question that you shouldn't consider before you have a good answer to "why wouldn't you?" — user1084944
– user1084944, Commented Jul 22, 2016 at 11:24
@Hurkyl. Because it requires the addition of extra maintenance burden for one thing. Someone had to design and write the __hash__ function, so they clearly saw benefit in it. I would like to know what made them not just leave it alone. — Mad Physicist
– Mad Physicist, Commented Jul 22, 2016 at 13:21
Although given the answers, it seems that disabling the hash function would require more work than just inheriting it from object. So in fact one of the considerations may have been reduction of maintenance debt. — Mad Physicist
– Mad Physicist, Commented Jul 22, 2016 at 16:16
@MadPhysicist, the code is trivial either way. In C, an object's type is represented by a struct PyTypeObject, and that struct's tp_hash member defines what happens for __hash__(). If that member is initialized to 0, it inherits __hash__(). If you want the type to refuse to hash, you just initialize it to PyObject_HashNotImplemented instead. (And if the type wants to implement a custom hash, you plug in the address of the type's custom hash function.) So maintenance burden really has nothing to do with it. — Tim Peters
– Tim Peters, Commented Jul 22, 2016 at 17:41

user2357112 · Accepted Answer · 2017-11-03 01:55:48Z

63

It's nothing special. As you can see if you examine the unbound __hash__ method of the function type:

>>> def f(): pass
...
>>> type(f).__hash__
<slot wrapper '__hash__' of 'object' objects>

the of 'object' objects part means it just inherits the default identity-based __hash__ from object. Function == and hash work by identity. The difference between id and hash is normal for any type that inherits object.__hash__:

>>> x = object()
>>> id(x)
40145072L
>>> hash(x)
2509067

You might think __hash__ is only supposed to be defined for immutable objects, and you'd be almost right, but that's missing a key detail. __hash__ should only be defined for objects where everything involved in == comparisons is immutable. For objects whose == is based on identity, it's completely standard to base hash on identity as well, since even if the objects are mutable, they can't possibly be mutable in a way that would change their identity. Files, modules, and other mutable objects with identity-based == all behave this way.

edited Nov 3, 2017 at 1:55

answered Jul 22, 2016 at 5:34

user2357112

286k32 gold badges490 silver badges571 bronze badges

Sign up to request clarification or add additional context in comments.

7 Comments

Mad Physicist Over a year ago

So by default, all instances in Python are hashable unless it is turned off somewhere in the class hierarchy? That is interesting news indeed, although intuitively it makes sense based on how few types you cant't use to key a dict.

Delioth Over a year ago

== does not work on identity. is is the identity operator.

user2357112 Over a year ago

@Delioth: == works by identity for any type that doesn't specify some other behavior.

Bakuriu Over a year ago

@MadPhysicist The proper way to tell python that a class is not hashable is by setting __hash__ to None: class X: __hash__ = None;{X(): 1} -> TypeError: unhashable type: 'X'

Mad Physicist Over a year ago

@Bakuriu. Normally I would agree with you. Have you tried doing this for a function? See the comments below stackoverflow.com/a/38518991/2988730.

|

Tim Peters · Accepted Answer · 2016-07-22 05:58:09Z

26

It can be useful, e.g., to create sets of function objects, or to index a dict by functions. Immutable objects normally support __hash__. In any case, there's no internal difference between a function defined by a def or by a lambda - that's purely syntactic.

The algorithm used depends on the version of Python. It looks like you're using a recent version of Python on a 64-bit box. In that case, the hash of a function object is the right rotation of its id() by 4 bits, with the result viewed as a signed 64-bit integer. The right shift is done because object addresses (id() results) are typically aligned so that their last 3 or 4 bits are always 0, and that's a mildly annoying property for a hash function.

In your specific example,

>>> i = 140643045241584 # your id() result
>>> (i >> 4) | ((i << 60) & 0xffffffffffffffff) # rotate right 4 bits
8790190327599  # == your hash() result

answered Jul 22, 2016 at 5:58

Tim Peters

71.4k14 gold badges133 silver badges140 bronze badges

3 Comments

Tim Peters Over a year ago

@JulienBernu, see @user2357112's response - it's immutable so far as == is concerned, which is all __hash__() cares about.

wilson0x4d Over a year ago

One of the biggest (only?) mistakes of the __hash__ spec is "The only required property is that objects which compare equal have the same hash value" -- this is a concept borne from inexperience which will plague python until the end of time. The hash of an object for the purpose of identity should never depend on object equality and should not mutate over time. developers would be wise to not implement their hash functions in a way that an object's state determines hash result. Python has made a lot of great and tough decisions over the years, this was not one of them. This was a mistake.

Tim Peters Over a year ago

@wilson0x4d, sorry, I have no idea what you're talking about. Open a new issue if you want to pursue it - comments are not intended to be discussion forums on this site. Especially not when you add a comment after a lag of over 8 year ;-)

Community · Accepted Answer · 2020-06-20 09:12:55Z

4

A function is hashable because it is a normal, builtin, non mutable object.

From the Python Manual:

An object is hashable if it has a hash value which never changes during its lifetime (it needs a __hash__() method), and can be compared to other objects (it needs an __eq__() or __cmp__() method). Hashable objects which compare equal must have the same hash value.

Hashability makes an object usable as a dictionary key and a set member, because these data structures use the hash value internally.

All of Python’s immutable built-in objects are hashable, while no mutable containers (such as lists or dictionaries) are. Objects which are instances of user-defined classes are hashable by default; they all compare unequal (except with themselves), and their hash value is derived from their id().

edited Jun 20, 2020 at 9:12

CommunityBot

11 silver badge

answered Jul 22, 2016 at 5:42

Julien

15.3k6 gold badges33 silver badges58 bronze badges

8 Comments

user2357112 Over a year ago

They're actually a lot more mutable than you might expect. You can add attributes to them, reassign their __name__, __module__, and __doc__, and even reassign their __code__ if you're feeling crazy.

Mad Physicist Over a year ago

You can also reassign their __hash__ function :)

user2357112 Over a year ago

@MadPhysicist: Though reassigning f.__hash__ won't actually do anything to hash(f).

Mad Physicist Over a year ago

@user2357112 Interesting. I just confirmed. Why is that?

Bakuriu Over a year ago

@MadPhysicist You just can't. The Python interpreter does support true readonly attributes, but these facilities can only be accessed via the C API. So from pure python code it's impossible to change those values (thank god!).

|

Collectives™ on Stack Overflow

Why and how are Python functions hashable?

3 Answers 3

7 Comments

3 Comments

8 Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

7 Comments

3 Comments

8 Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related