How to implement an efficient bidirectional hash table?

Question

Python dict is a very useful data-structure:

d = {'a': 1, 'b': 2}

d['a'] # get 1

Sometimes you'd also like to index by values.

d[1] # get 'a'

Which is the most efficient way to implement this data-structure? Any official recommend way to do it?

If you prefer, we can assume that values are immutable as well as keys are. — Juanjo Conti
– Juanjo Conti, Commented Jul 23, 2010 at 13:39
What would you return for this dict: {'a' : 1, 'b': 2, 'A' : 1 } — PaulMcG
– PaulMcG, Commented Jul 23, 2010 at 16:49
@PaulMcGuire : I would return {1: ['a', 'A'], 2: 'b'}. See my answer for such a way to do it. — Basj
– Basj, Commented Feb 19, 2014 at 22:41
Note to moderator: this is not a duplicate of stackoverflow.com/questions/1456373/two-way-reverse-map. The latter has 1) very vague wording 2) no MCVE 3) only deals with the case of the bijective map (see first comment in this question), which is a lot more restrictive than this actual question, which is more general. So I think marking it as duplicate is here, in this particular case, misleading. If really one should be a duplicate of another, it should be the contrary as this one here covers the general case whereas the other (see answers) doesn't cover the non-bijective case. — Basj
– Basj, Commented Jun 19, 2018 at 9:13
This question is more than ten years old, but I am reading it for the first time now. You might find inspiration in the Java library Google Guava. They have a class HashBiMap that is worth reading. (I assume a similar structure could be implemented in Python.) The documentation clearly outlines edge cases and how they are handled. Ref: github.com/google/guava/blob/master/guava/src/com/google/common/… — kevinarpe
– kevinarpe, Commented Apr 27, 2021 at 5:59

Basj · Accepted Answer · 2022-09-14 08:20:59Z

113

Here is a class for a bidirectional dict, inspired by Finding key from value in Python dictionary and modified to allow the following 2) and 3).

Note that :

1. The inverse directory bd.inverse auto-updates itself when the standard dict bd is modified.
1. The inverse directory bd.inverse[value] is always a list of key such that bd[key] == value.
1. Unlike the bidict module from https://pypi.python.org/pypi/bidict, here we can have 2 keys having same value, this is very important.

Code:

class bidict(dict):
    def __init__(self, *args, **kwargs):
        super(bidict, self).__init__(*args, **kwargs)
        self.inverse = {}
        for key, value in self.items():
            self.inverse.setdefault(value, []).append(key) 

    def __setitem__(self, key, value):
        if key in self:
            self.inverse[self[key]].remove(key) 
        super(bidict, self).__setitem__(key, value)
        self.inverse.setdefault(value, []).append(key)        

    def __delitem__(self, key):
        self.inverse.setdefault(self[key], []).remove(key)
        if self[key] in self.inverse and not self.inverse[self[key]]: 
            del self.inverse[self[key]]
        super(bidict, self).__delitem__(key)

Usage example:

bd = bidict({'a': 1, 'b': 2})  
print(bd)                     # {'a': 1, 'b': 2}                 
print(bd.inverse)             # {1: ['a'], 2: ['b']}
bd['c'] = 1                   # Now two keys have the same value (= 1)
print(bd)                     # {'a': 1, 'c': 1, 'b': 2}
print(bd.inverse)             # {1: ['a', 'c'], 2: ['b']}
del bd['c']
print(bd)                     # {'a': 1, 'b': 2}
print(bd.inverse)             # {1: ['a'], 2: ['b']}
del bd['a']
print(bd)                     # {'b': 2}
print(bd.inverse)             # {2: ['b']}
bd['b'] = 3
print(bd)                     # {'b': 3}
print(bd.inverse)             # {2: [], 3: ['b']}

edited Sep 14, 2022 at 8:20

answered Feb 19, 2014 at 22:34

Basj

47.5k113 gold badges467 silver badges819 bronze badges

Sign up to request clarification or add additional context in comments.

18 Comments

Cecil Curry Over a year ago

This is phenomenal. It's succinct; it's self-documenting; it's reasonably efficient; it just works. My only quibble would be to optimize the repeated lookups of self[key] in __delitem__() with a single value = self[key] assignment reused for such lookups. But... yeah. That's negligible. Thanks for the pure awesome, Basj!

zelusp Over a year ago

How about a Python 3 version?

The Nate Over a year ago

Ah. Right. Try it without "iter" and it should work.

searchengine27 Over a year ago

I like this answer for the example. The accepted answer is correct still and I think the accepted answer should stay as the accepted answer, but this is a little bit more explicit for defining it yourself, merely because it clearly lays out that in order to reverse the dictionary you must place the reverse's values into a list since there cannot be a one-to-one mapping because a dictionary has a one-to-many relationship with key-to-values.

Stephen Eckels Over a year ago

this doesn't support many to many in both directions. bd = bidict({'a': [1,2,3], 'b': [2,3,4]}) errors about unhashable type.

|

Emil · Accepted Answer · 2010-07-23 13:59:45Z

55

You can use the same dict itself by adding key,value pair in reverse order.

d={'a':1,'b':2}
revd=dict([reversed(i) for i in d.items()])
d.update(revd)

answered Jul 23, 2010 at 13:59

Emil

13.8k19 gold badges74 silver badges108 bronze badges

12 Comments

FMc Over a year ago

+1 A nice, practical solution. Another way to write it: d.update( dict((d[k], k) for k in d) ).

Beni Cherniavsky-Paskin Over a year ago

+1 For neat use of reversed(). I'm undecided if it's more readable than the explicit dict((v, k) for (k, v) in d.items()). In any case, you can pass pairs directly to .update: d.update(reversed(i) for i in d.items()).

Tobias Kienzler Over a year ago

Note this fails e.g. for d={'a':1, 'b':2, 1: 'b'}

0xc0de Over a year ago

Slight modification: dict(map(reversed, a_dict.items())).

Cecil Curry Over a year ago

Adding reverse mappings to the original dictionary is an awful idea. As the above comments demonstrate, doing so is not safe in the general case. Just maintain two separate dictionaries. Since the first two lines of this answer ignoring the trailing d.update(revd) are great, however, I'm still contemplating an upvote. Let's give this some thought.

|

mlissner · Accepted Answer · 2016-03-03 20:08:07Z

46

A poor man's bidirectional hash table would be to use just two dictionaries (these are highly tuned datastructures already).

There is also a bidict package on the index:

https://pypi.python.org/pypi/bidict

The source for bidict can be found on github:

https://github.com/jab/bidict

edited Mar 3, 2016 at 20:08

mlissner

18.3k18 gold badges121 silver badges177 bronze badges

answered Jul 23, 2010 at 13:41

miku

189k47 gold badges314 silver badges317 bronze badges

11 Comments

Juanjo Conti Over a year ago

2 dicts requires double inserts and deletes.

Walter Mundt Over a year ago

@Juanjo: nearly any bidirectional/reversible hash table will involve "double inserts and deletes", either as part of implementing the structure, or as part of using it. Keeping two indexes is really the only fast way to do it, AFAIK.

Juanjo Conti Over a year ago

Of course; I meant that take care of the 2 index by hand is the problem.

user193130 Over a year ago

@Basj I think it's correct that it's not accepted since having more than one value means it's not a bijection anymore and is ambiguous for the reverse lookup.

user193130 Over a year ago

@Basj Well, I can understand that there would be use cases which would be useful to have more than one value per key, so maybe this type of data structure should exist as a subclass of bidict. However, since a normal dict maps to a single object, I think it makes much more sense for the reverse to be the same as well. (Just to clarify, although the value can be a collection too, I meant that the key of the first dict should be of the same type as the value of the reverse dict)

|

jme · Accepted Answer · 2015-12-25 05:12:17Z

12

The below snippet of code implements an invertible (bijective) map:

class BijectionError(Exception):
    """Must set a unique value in a BijectiveMap."""

    def __init__(self, value):
        self.value = value
        msg = 'The value "{}" is already in the mapping.'
        super().__init__(msg.format(value))


class BijectiveMap(dict):
    """Invertible map."""

    def __init__(self, inverse=None):
        if inverse is None:
            inverse = self.__class__(inverse=self)
        self.inverse = inverse

    def __setitem__(self, key, value):
        if value in self.inverse:
            raise BijectionError(value)

        self.inverse._set_item(value, key)
        self._set_item(key, value)

    def __delitem__(self, key):
        self.inverse._del_item(self[key])
        self._del_item(key)

    def _del_item(self, key):
        super().__delitem__(key)

    def _set_item(self, key, value):
        super().__setitem__(key, value)

The advantage of this implementation is that the inverse attribute of a BijectiveMap is again a BijectiveMap. Therefore you can do things like:

>>> foo = BijectiveMap()
>>> foo['steve'] = 42
>>> foo.inverse
{42: 'steve'}
>>> foo.inverse.inverse
{'steve': 42}
>>> foo.inverse.inverse is foo
True

answered Dec 25, 2015 at 5:12

jme

20.8k6 gold badges44 silver badges40 bronze badges

1 Comment

Aemyl Over a year ago

This also ensures that modifying foo.inverse doesn't leave foo in an inconsistent state, unlike Basj's answer

Basj · Accepted Answer · 2014-02-19 17:40:24Z

1

Something like this, maybe:

import itertools

class BidirDict(dict):
    def __init__(self, iterable=(), **kwargs):
        self.update(iterable, **kwargs)
    def update(self, iterable=(), **kwargs):
        if hasattr(iterable, 'iteritems'):
            iterable = iterable.iteritems()
        for (key, value) in itertools.chain(iterable, kwargs.iteritems()):
            self[key] = value
    def __setitem__(self, key, value):
        if key in self:
            del self[key]
        if value in self:
            del self[value]
        dict.__setitem__(self, key, value)
        dict.__setitem__(self, value, key)
    def __delitem__(self, key):
        value = self[key]
        dict.__delitem__(self, key)
        dict.__delitem__(self, value)
    def __repr__(self):
        return '%s(%s)' % (type(self).__name__, dict.__repr__(self))

You have to decide what you want to happen if more than one key has a given value; the bidirectionality of a given pair could easily be clobbered by some later pair you inserted. I implemented one possible choice.

Example :

bd = BidirDict({'a': 'myvalue1', 'b': 'myvalue2', 'c': 'myvalue2'})
print bd['myvalue1']   # a
print bd['myvalue2']   # b

edited Feb 19, 2014 at 17:40

Basj

47.5k113 gold badges467 silver badges819 bronze badges

answered Jul 23, 2010 at 13:58

Matt Anderson

19.9k12 gold badges46 silver badges59 bronze badges

3 Comments

tgray Over a year ago

I'm not sure if this is a problem, but using the above implementation, wouldn't there be issues if the keys and values overlapped? So dict([('a', 'b'), ('b', 'c')]); dict['b'] -> 'c' instead of the key 'a'.

tgray Over a year ago

It's not an issue for the OP's example, but might be a good disclaimer to include.

Basj Over a year ago

How can we do that print bd['myvalue2'] answers b, c (or [b, c], or (b, c), or anything else) ?

NeoWang · Accepted Answer · 2015-12-25 07:42:10Z

First, you have to make sure the key to value mapping is one to one, otherwise, it is not possible to build a bidirectional map.

Second, how large is the dataset? If there is not much data, just use 2 separate maps, and update both of them when updating. Or better, use an existing solution like Bidict, which is just a wrapper of 2 dicts, with updating/deletion built in.

But if the dataset is large, and maintaining 2 dicts is not desirable:

If both key and value are numeric, consider the possibility of using Interpolation to approximate the mapping. If the vast majority of the key-value pairs can be covered by the mapping function (and its
reverse function), then you only need to record the outliers in maps.
If most of access is uni-directional (key->value), then it is totally ok to build the reverse map incrementally, to trade time for
space.

Code:

d = {1: "one", 2: "two" }
reverse = {}

def get_key_by_value(v):
    if v not in reverse:
        for _k, _v in d.items():
           if _v == v:
               reverse[_v] = _k
               break
    return reverse[v]

ListenSoftware Louise Ai Agent · Accepted Answer · 2021-10-23 20:33:41Z

1

a better way is convert the dictionary to a list of tuples then sort on a specific tuple field

def convert_to_list(dictionary):
    list_of_tuples = []
    for key, value in dictionary.items():
        list_of_tuples.append((key, value))
    return list_of_tuples

def sort_list(list_of_tuples, field):
     return sorted(list_of_tuples, key=lambda x: x[field])

dictionary = {'a': 9, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
list_of_tuples = convert_to_list(dictionary)
print(sort_list(list_of_tuples, 1))

output

[('b', 2), ('c', 3), ('d', 4), ('e', 5), ('a', 9)]

answered Oct 23, 2021 at 20:33

ListenSoftware Louise Ai Agent

4,3432 gold badges31 silver badges39 bronze badges

Comments

Charles Merriam · Accepted Answer · 2020-07-10 02:50:11Z

-2

Unfortunately, the highest rated answer, bidict does not work.

There are three options:

Subclass dict: You can create a subclass of dict, but beware. You need to write custom implementations ofupdate, pop, initializer, setdefault. The dict implementations do not call __setitem__. This is why the highest rated answer has issues.
Inherit from UserDict: This is just like a dict, except all the routines are made to call correctly. It uses a dict under the hood, in an item called data. You can read the Python Documentation, or use a simple implementation of a by directional list that works in Python 3. Sorry for not including it verbatim: I'm unsure of its copyright.
Inherit from Abstract Base Classes: Inheriting from collections.abc will help you get all the correct protocols and implementations for a new class. This is overkill for a bidirectional dictionary, unless it can also encrypt and cache to a database.

TL;DR -- Use this for your code. Read Trey Hunner's article for details.

answered Jul 10, 2020 at 2:50

Charles Merriam

20.7k7 gold badges77 silver badges84 bronze badges

3 Comments

tejasvi Over a year ago

Nice article. Still, what exactly does not work with bidict?

Charles Merriam Over a year ago

What did not work two years ago may or may not work now.

Chris Adams Over a year ago

It did work 2 years ago, it worked with Python 2.x, and it works with Python 3.10 now. Rather than distract with a mistake from whatever testing you did in 2020, it'd be better to focus on what your advice offers which is different since the mention of UserDict, etc. is useful and might point someone with more advanced needs in the right direction.

Collectives™ on Stack Overflow

How to implement an efficient bidirectional hash table?

8 Answers 8

18 Comments

12 Comments

11 Comments

1 Comment

3 Comments

Comments

Comments

3 Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

8 Answers 8

18 Comments

12 Comments

11 Comments

1 Comment

3 Comments

Comments

Comments

3 Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related