140

Python dict is a very useful data-structure:

d = {'a': 1, 'b': 2}

d['a'] # get 1

Sometimes you'd also like to index by values.

d[1] # get 'a'

Which is the most efficient way to implement this data-structure? Any official recommend way to do it?

6
  • If you prefer, we can assume that values are immutable as well as keys are. Commented Jul 23, 2010 at 13:39
  • 6
    What would you return for this dict: {'a' : 1, 'b': 2, 'A' : 1 } Commented Jul 23, 2010 at 16:49
  • 5
    @PaulMcGuire : I would return {1: ['a', 'A'], 2: 'b'}. See my answer for such a way to do it. Commented Feb 19, 2014 at 22:41
  • 5
    Note to moderator: this is not a duplicate of stackoverflow.com/questions/1456373/two-way-reverse-map. The latter has 1) very vague wording 2) no MCVE 3) only deals with the case of the bijective map (see first comment in this question), which is a lot more restrictive than this actual question, which is more general. So I think marking it as duplicate is here, in this particular case, misleading. If really one should be a duplicate of another, it should be the contrary as this one here covers the general case whereas the other (see answers) doesn't cover the non-bijective case. Commented Jun 19, 2018 at 9:13
  • 2
    This question is more than ten years old, but I am reading it for the first time now. You might find inspiration in the Java library Google Guava. They have a class HashBiMap that is worth reading. (I assume a similar structure could be implemented in Python.) The documentation clearly outlines edge cases and how they are handled. Ref: github.com/google/guava/blob/master/guava/src/com/google/common/… Commented Apr 27, 2021 at 5:59

8 Answers 8

113

Here is a class for a bidirectional dict, inspired by Finding key from value in Python dictionary and modified to allow the following 2) and 3).

Note that :

    1. The inverse directory bd.inverse auto-updates itself when the standard dict bd is modified.
    1. The inverse directory bd.inverse[value] is always a list of key such that bd[key] == value.
    1. Unlike the bidict module from https://pypi.python.org/pypi/bidict, here we can have 2 keys having same value, this is very important.

Code:

class bidict(dict):
    def __init__(self, *args, **kwargs):
        super(bidict, self).__init__(*args, **kwargs)
        self.inverse = {}
        for key, value in self.items():
            self.inverse.setdefault(value, []).append(key) 

    def __setitem__(self, key, value):
        if key in self:
            self.inverse[self[key]].remove(key) 
        super(bidict, self).__setitem__(key, value)
        self.inverse.setdefault(value, []).append(key)        

    def __delitem__(self, key):
        self.inverse.setdefault(self[key], []).remove(key)
        if self[key] in self.inverse and not self.inverse[self[key]]: 
            del self.inverse[self[key]]
        super(bidict, self).__delitem__(key)

Usage example:

bd = bidict({'a': 1, 'b': 2})  
print(bd)                     # {'a': 1, 'b': 2}                 
print(bd.inverse)             # {1: ['a'], 2: ['b']}
bd['c'] = 1                   # Now two keys have the same value (= 1)
print(bd)                     # {'a': 1, 'c': 1, 'b': 2}
print(bd.inverse)             # {1: ['a', 'c'], 2: ['b']}
del bd['c']
print(bd)                     # {'a': 1, 'b': 2}
print(bd.inverse)             # {1: ['a'], 2: ['b']}
del bd['a']
print(bd)                     # {'b': 2}
print(bd.inverse)             # {2: ['b']}
bd['b'] = 3
print(bd)                     # {'b': 3}
print(bd.inverse)             # {2: [], 3: ['b']}
Sign up to request clarification or add additional context in comments.

18 Comments

This is phenomenal. It's succinct; it's self-documenting; it's reasonably efficient; it just works. My only quibble would be to optimize the repeated lookups of self[key] in __delitem__() with a single value = self[key] assignment reused for such lookups. But... yeah. That's negligible. Thanks for the pure awesome, Basj!
How about a Python 3 version?
Ah. Right. Try it without "iter" and it should work.
I like this answer for the example. The accepted answer is correct still and I think the accepted answer should stay as the accepted answer, but this is a little bit more explicit for defining it yourself, merely because it clearly lays out that in order to reverse the dictionary you must place the reverse's values into a list since there cannot be a one-to-one mapping because a dictionary has a one-to-many relationship with key-to-values.
this doesn't support many to many in both directions. bd = bidict({'a': [1,2,3], 'b': [2,3,4]}) errors about unhashable type.
|
55

You can use the same dict itself by adding key,value pair in reverse order.

d={'a':1,'b':2}
revd=dict([reversed(i) for i in d.items()])
d.update(revd)

12 Comments

+1 A nice, practical solution. Another way to write it: d.update( dict((d[k], k) for k in d) ).
+1 For neat use of reversed(). I'm undecided if it's more readable than the explicit dict((v, k) for (k, v) in d.items()). In any case, you can pass pairs directly to .update: d.update(reversed(i) for i in d.items()).
Note this fails e.g. for d={'a':1, 'b':2, 1: 'b'}
Slight modification: dict(map(reversed, a_dict.items())).
Adding reverse mappings to the original dictionary is an awful idea. As the above comments demonstrate, doing so is not safe in the general case. Just maintain two separate dictionaries. Since the first two lines of this answer ignoring the trailing d.update(revd) are great, however, I'm still contemplating an upvote. Let's give this some thought.
|
46

A poor man's bidirectional hash table would be to use just two dictionaries (these are highly tuned datastructures already).

There is also a bidict package on the index:

The source for bidict can be found on github:

11 Comments

2 dicts requires double inserts and deletes.
@Juanjo: nearly any bidirectional/reversible hash table will involve "double inserts and deletes", either as part of implementing the structure, or as part of using it. Keeping two indexes is really the only fast way to do it, AFAIK.
Of course; I meant that take care of the 2 index by hand is the problem.
@Basj I think it's correct that it's not accepted since having more than one value means it's not a bijection anymore and is ambiguous for the reverse lookup.
@Basj Well, I can understand that there would be use cases which would be useful to have more than one value per key, so maybe this type of data structure should exist as a subclass of bidict. However, since a normal dict maps to a single object, I think it makes much more sense for the reverse to be the same as well. (Just to clarify, although the value can be a collection too, I meant that the key of the first dict should be of the same type as the value of the reverse dict)
|
12

The below snippet of code implements an invertible (bijective) map:

class BijectionError(Exception):
    """Must set a unique value in a BijectiveMap."""

    def __init__(self, value):
        self.value = value
        msg = 'The value "{}" is already in the mapping.'
        super().__init__(msg.format(value))


class BijectiveMap(dict):
    """Invertible map."""

    def __init__(self, inverse=None):
        if inverse is None:
            inverse = self.__class__(inverse=self)
        self.inverse = inverse

    def __setitem__(self, key, value):
        if value in self.inverse:
            raise BijectionError(value)

        self.inverse._set_item(value, key)
        self._set_item(key, value)

    def __delitem__(self, key):
        self.inverse._del_item(self[key])
        self._del_item(key)

    def _del_item(self, key):
        super().__delitem__(key)

    def _set_item(self, key, value):
        super().__setitem__(key, value)

The advantage of this implementation is that the inverse attribute of a BijectiveMap is again a BijectiveMap. Therefore you can do things like:

>>> foo = BijectiveMap()
>>> foo['steve'] = 42
>>> foo.inverse
{42: 'steve'}
>>> foo.inverse.inverse
{'steve': 42}
>>> foo.inverse.inverse is foo
True

1 Comment

This also ensures that modifying foo.inverse doesn't leave foo in an inconsistent state, unlike Basj's answer
1

Something like this, maybe:

import itertools

class BidirDict(dict):
    def __init__(self, iterable=(), **kwargs):
        self.update(iterable, **kwargs)
    def update(self, iterable=(), **kwargs):
        if hasattr(iterable, 'iteritems'):
            iterable = iterable.iteritems()
        for (key, value) in itertools.chain(iterable, kwargs.iteritems()):
            self[key] = value
    def __setitem__(self, key, value):
        if key in self:
            del self[key]
        if value in self:
            del self[value]
        dict.__setitem__(self, key, value)
        dict.__setitem__(self, value, key)
    def __delitem__(self, key):
        value = self[key]
        dict.__delitem__(self, key)
        dict.__delitem__(self, value)
    def __repr__(self):
        return '%s(%s)' % (type(self).__name__, dict.__repr__(self))

You have to decide what you want to happen if more than one key has a given value; the bidirectionality of a given pair could easily be clobbered by some later pair you inserted. I implemented one possible choice.


Example :

bd = BidirDict({'a': 'myvalue1', 'b': 'myvalue2', 'c': 'myvalue2'})
print bd['myvalue1']   # a
print bd['myvalue2']   # b        

3 Comments

I'm not sure if this is a problem, but using the above implementation, wouldn't there be issues if the keys and values overlapped? So dict([('a', 'b'), ('b', 'c')]); dict['b'] -> 'c' instead of the key 'a'.
It's not an issue for the OP's example, but might be a good disclaimer to include.
How can we do that print bd['myvalue2'] answers b, c (or [b, c], or (b, c), or anything else) ?
1

First, you have to make sure the key to value mapping is one to one, otherwise, it is not possible to build a bidirectional map.

Second, how large is the dataset? If there is not much data, just use 2 separate maps, and update both of them when updating. Or better, use an existing solution like Bidict, which is just a wrapper of 2 dicts, with updating/deletion built in.

But if the dataset is large, and maintaining 2 dicts is not desirable:

  • If both key and value are numeric, consider the possibility of using Interpolation to approximate the mapping. If the vast majority of the key-value pairs can be covered by the mapping function (and its
    reverse function), then you only need to record the outliers in maps.

  • If most of access is uni-directional (key->value), then it is totally ok to build the reverse map incrementally, to trade time for
    space.

Code:

d = {1: "one", 2: "two" }
reverse = {}

def get_key_by_value(v):
    if v not in reverse:
        for _k, _v in d.items():
           if _v == v:
               reverse[_v] = _k
               break
    return reverse[v]

Comments

1

a better way is convert the dictionary to a list of tuples then sort on a specific tuple field

def convert_to_list(dictionary):
    list_of_tuples = []
    for key, value in dictionary.items():
        list_of_tuples.append((key, value))
    return list_of_tuples

def sort_list(list_of_tuples, field):
     return sorted(list_of_tuples, key=lambda x: x[field])

dictionary = {'a': 9, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
list_of_tuples = convert_to_list(dictionary)
print(sort_list(list_of_tuples, 1))

output

[('b', 2), ('c', 3), ('d', 4), ('e', 5), ('a', 9)]

Comments

-2

Unfortunately, the highest rated answer, bidict does not work.

There are three options:

  1. Subclass dict: You can create a subclass of dict, but beware. You need to write custom implementations ofupdate, pop, initializer, setdefault. The dict implementations do not call __setitem__. This is why the highest rated answer has issues.

  2. Inherit from UserDict: This is just like a dict, except all the routines are made to call correctly. It uses a dict under the hood, in an item called data. You can read the Python Documentation, or use a simple implementation of a by directional list that works in Python 3. Sorry for not including it verbatim: I'm unsure of its copyright.

  3. Inherit from Abstract Base Classes: Inheriting from collections.abc will help you get all the correct protocols and implementations for a new class. This is overkill for a bidirectional dictionary, unless it can also encrypt and cache to a database.

TL;DR -- Use this for your code. Read Trey Hunner's article for details.

3 Comments

Nice article. Still, what exactly does not work with bidict?
What did not work two years ago may or may not work now.
It did work 2 years ago, it worked with Python 2.x, and it works with Python 3.10 now. Rather than distract with a mistake from whatever testing you did in 2020, it'd be better to focus on what your advice offers which is different since the mention of UserDict, etc. is useful and might point someone with more advanced needs in the right direction.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.