
I want to cache a function that takes a list as a parameter, but when I try to do so with the functools.lru_cache decorator, it fails with TypeError: unhashable type: 'list'.


import functools

@functools.lru_cache()
def example_func(lst):
    return sum(lst) + max(lst) + min(lst)


print(example_func([1, 2]))
# TypeError: unhashable type: 'list'
  • Possible duplicate of Hashing arrays in Python – Commented Mar 10, 2018 at 15:57
  • @Alex just putting this here because googling this ("lrucache python list") didn't find a lot. I then made a custom class with a custom hash function. I later asked a professional Python dev about this, and he suggested using a tuple. I do think these two questions are related, but not duplicates. – Commented Mar 10, 2018 at 20:12

7 Answers


This fails because a list is unhashable: lru_cache stores results in a dictionary keyed by the function's arguments, and dictionary keys must be hashable. A way to fix this is to convert lists to tuples before passing them to a cached function; since tuples are immutable and hashable, they can serve as cache keys.

TL;DR

Use a tuple instead of a list:

>>> from functools import lru_cache
>>> @lru_cache(maxsize=2)
... def my_function(args):
...     pass
...
>>> my_function([1,2,3])
Traceback (most recent call last):
  File "<input>", line 1, in <module>
    my_function([1,2,3])
TypeError: unhashable type: 'list'

>>> # TO FIX: use a tuple 

>>> my_function(tuple([1,2,3]))
>>>

1 Comment

If using mypy and the list could be any length, you'll have to annotate the type as a homogeneous variable-length tuple: Tuple[int, ...]. See stackoverflow.com/a/54747327/733092.
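The comment above can be illustrated with a short sketch (the function name `total` and the `maxsize=None` setting are my own choices, not from the thread):

```python
from functools import lru_cache
from typing import Tuple

@lru_cache(maxsize=None)
def total(values: Tuple[int, ...]) -> int:
    # Tuple[int, ...] annotates a homogeneous tuple of any length,
    # so mypy accepts calls like total((1,)) and total((1, 2, 3)).
    return sum(values)

print(total(tuple([1, 2, 3])))  # 6
```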

Ideally, this should not throw an error; the decorator should convert arguments into a hashable form without the user even knowing it. You can fix this problem by decorating your functions like this:

from functools import lru_cache

# Custom decorator function
def list_to_tuple(function):
    def wrapper(*args):
        args = [tuple(x) if isinstance(x, list) else x for x in args]
        result = function(*args)
        result = tuple(result) if isinstance(result, list) else result
        return result
    return wrapper

# your cached function
@list_to_tuple
@lru_cache(maxsize=cacheMaxSize)
def checkIfAdminAcquired(self, admin_id) -> list:
    query = "SELECT id FROM public.admins WHERE id IN ({}) and confirmed_at IS NOT NULL"
    response = self.handleQuery(query, "int", admin_id)
    return response

You might want to add yet another decorator after lru_cache to make sure that the output of the function is a list rather than a tuple, since right now it will return a tuple.
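Put together, a minimal self-contained sketch of this approach, applied to the question's example_func (trimmed to positional arguments only; keyword handling is discussed in the comments below):

```python
from functools import lru_cache

def list_to_tuple(function):
    """Convert list arguments to tuples so the lru_cache beneath can hash them."""
    def wrapper(*args):
        args = tuple(tuple(x) if isinstance(x, list) else x for x in args)
        return function(*args)
    return wrapper

@list_to_tuple            # runs first: [1, 2] becomes (1, 2)
@lru_cache(maxsize=None)  # then caches on the now-hashable tuple
def example_func(lst):
    return sum(lst) + max(lst) + min(lst)

print(example_func([1, 2]))  # 6
```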

3 Comments

Isn't it sufficient to convert args to a tuple? What is the purpose of also converting result?
I guess because the cache stores <key, value> pairs as <key=function+args, value=result>, so if your result is also not hashable, better to make it hashable as well...
To extend this for named arguments, add kwargs = {k: tuple(x) if isinstance(x, list) else x for k, x in kwargs.items()} to the wrapper and modify the result assignment to result = function(*args, **kwargs).

Sometimes a parameter can take either a simple hashable type or a complicated unhashable type with no straightforward hashable conversion of the kind the current answers propose. In that situation it may still be desirable to use a cache for the (possibly more common) hashable case, while simply calling the underlying function, rather than erroring out, in the unhashable case.

This ignores the error and works generally for any hashable type:

import functools

def ignore_unhashable(func): 
    uncached = func.__wrapped__
    attributes = functools.WRAPPER_ASSIGNMENTS + ('cache_info', 'cache_clear')
    @functools.wraps(func, assigned=attributes) 
    def wrapper(*args, **kwargs): 
        try: 
            return func(*args, **kwargs) 
        except TypeError as error: 
            if 'unhashable type' in str(error): 
                return uncached(*args, **kwargs) 
            raise 
    wrapper.__uncached__ = uncached
    return wrapper

Usage and testing:

@ignore_unhashable
@functools.lru_cache()
def example_func(lst):
    return sum(lst) + max(lst) + min(lst)

example_func([1, 2]) # 6
example_func.cache_info()
# CacheInfo(hits=0, misses=0, maxsize=128, currsize=0)
example_func((1, 2)) # 6
example_func.cache_info()
# CacheInfo(hits=0, misses=1, maxsize=128, currsize=1)
example_func((1, 2)) # 6
example_func.cache_info()
# CacheInfo(hits=1, misses=1, maxsize=128, currsize=1)

Took me a moment to wrap my head around it, but example_func.__wrapped__ is the lru_cache's version and example_func.__uncached__ is the original version.

Comments


The answer from @Donatas-Svilpa works for me; however, it needs to cover the keyword-arguments case as well.

from typing import Any, Callable

def list_to_tuple(function: Callable) -> Any:
    """Custom decorator function to convert list arguments (and results) to tuples."""

    def wrapper(*args, **kwargs) -> Any:
        args = tuple(tuple(x) if isinstance(x, list) else x for x in args)
        kwargs = {k: tuple(v) if isinstance(v, list) else v for k, v in kwargs.items()}
        result = function(*args, **kwargs)
        result = tuple(result) if isinstance(result, list) else result
        return result

    return wrapper

Then we can use it this way, stacked above lru_cache as in the original answer:

@list_to_tuple
@lru_cache(maxsize=None)
def do_some_work(arg_1: List[Any], arg_2: int) -> List[Any]:
    pass

Comments


You can extend functools.lru_cache to digest lists, dicts, and more. The key idea is passing a hashed value of the arguments to lru_cache, not the raw arguments. Below is an example implementation that hashes lists and dicts in arguments.

from typing import Callable, TypeVar, Any
from typing_extensions import ParamSpec

from functools import lru_cache, _CacheInfo

def hash_list(l: list) -> int:
    __hash = 0
    for i, e in enumerate(l):
        __hash = hash((__hash, i, hash_item(e)))
    return __hash

def hash_dict(d: dict) -> int:
    __hash = 0
    for k, v in d.items():
        __hash = hash((__hash, k, hash_item(v)))
    return __hash

def hash_item(e) -> int:
    if hasattr(e, '__hash__') and callable(e.__hash__):
        try:
            return hash(e)
        except TypeError:
            pass
    if isinstance(e, (list, set, tuple)):
        return hash_list(list(e))
    elif isinstance(e, (dict)):
        return hash_dict(e)
    else:
        raise TypeError(f'unhashable type: {e.__class__}')

PT = ParamSpec("PT")
RT = TypeVar("RT")

def lru_cache_ext(*opts, hashfunc: Callable[..., int] = hash_item, **kwopts) -> Callable[[Callable[PT, RT]], Callable[PT, RT]]:
    def decorator(func: Callable[PT, RT]) -> Callable[PT, RT]:
        class _lru_cache_ext_wrapper:
            args: tuple
            kwargs: dict[str, Any]

            def cache_info(self) -> _CacheInfo: ...
            def cache_clear(self) -> None: ...

            @classmethod
            @lru_cache(*opts, **kwopts)
            def cached_func(cls, args_hash: int) -> RT:
                return func(*cls.args, **cls.kwargs)

            @classmethod
            def __call__(cls, *args: PT.args, **kwargs: PT.kwargs) -> RT:
                __hash = hashfunc((id(func), *[hashfunc(a) for a in args], *[(hashfunc(k), hashfunc(v)) for k, v in kwargs.items()]))

                cls.args = args
                cls.kwargs = kwargs

                cls.cache_info = cls.cached_func.cache_info
                cls.cache_clear = cls.cached_func.cache_clear

                return cls.cached_func(__hash)

        return _lru_cache_ext_wrapper()

    return decorator

Using lru_cache_ext is exactly the same as using the original lru_cache.

@lru_cache_ext(maxsize=None)
def example_func(lst):
    return sum(lst) + max(lst) + min(lst)


print(example_func([1, 2]))
print(example_func([1, 2]))
print(example_func([1, 2]))

print(example_func.cache_info())

The above code will print:

6
6
6
CacheInfo(hits=2, misses=1, maxsize=None, currsize=1)
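The property the hash helpers rely on can be checked in isolation. This is a condensed, self-contained variant of the answer's hash_item (collapsing hash_list and hash_dict into one recursive function): structurally equal containers hash equally, so they map to the same cache entry.

```python
def hash_item(e) -> int:
    """Structurally hash containers that the built-in hash() rejects."""
    try:
        return hash(e)  # works for ints, strings, tuples of hashables, ...
    except TypeError:
        pass
    if isinstance(e, (list, set, tuple)):
        h = 0
        for i, x in enumerate(e):
            h = hash((h, i, hash_item(x)))
        return h
    if isinstance(e, dict):
        h = 0
        for k, v in e.items():
            h = hash((h, k, hash_item(v)))
        return h
    raise TypeError(f'unhashable type: {e.__class__}')

# Structurally equal arguments hash equally, even when nested:
print(hash_item([1, [2, 3]]) == hash_item([1, [2, 3]]))  # True
```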

2 Comments

Wrong! If you do print(cached_func.cache_info()) on every call, you'll see that no caching is happening. That's because you're creating a new cache function on every call, so it does nothing.
@AvinashThakur Thanks! The example has been updated to avoid that problem by creating a persistent object instead of defining new functions on every call.

Another great option is to use the memoization library, which is a wrapper around @lru_cache that supports unhashable arguments. Example:

Run pip install memoization, then:

from memoization import cached

@cached
def example_func(lst):
    return sum(lst) + max(lst) + min(lst)

Comments


If you do not need LRU eviction, and all arguments are long-lived objects, you may use this simple implementation, which builds cache keys from the arguments' identities (id).

import time


def cacheRef(f):
    cache = {}

    def g(*args):
        # use `id` to get memory address for function argument.
        cache_key = '-'.join(list(map(lambda e: str(id(e)), args)))
        if cache_key in cache:
            return cache[cache_key]
        v = f(*args)
        cache[cache_key] = v
        return v

    return g


@cacheRef
def someHeavyWork(p1):
    time.sleep(3)
    return ''.join(p1)


l1 = ['a', 'b', 'c']
l2 = ['d', 'e', 'f']

t0 = time.time()
print(int(time.time() - t0), someHeavyWork(l1))
print(int(time.time() - t0), someHeavyWork(l1))
print(int(time.time() - t0), someHeavyWork(l1))
print(int(time.time() - t0), someHeavyWork(l2))
print(int(time.time() - t0), someHeavyWork(l2))
print(int(time.time() - t0), someHeavyWork(l2))

'''
output:
0 abc
3 abc
3 abc
3 def
6 def
6 def
'''
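One caveat worth demonstrating (the names cache_by_id and join_chars below are my own, not from the answer): because the key is built from id(), mutating an argument in place silently returns a stale cached result.

```python
def cache_by_id(f):
    """Cache on object identity; only safe if arguments are never mutated."""
    cache = {}
    def g(*args):
        key = tuple(id(a) for a in args)
        if key not in cache:
            cache[key] = f(*args)
        return cache[key]
    return g

@cache_by_id
def join_chars(lst):
    return ''.join(lst)

l = ['a', 'b']
print(join_chars(l))  # 'ab' -- computed and cached under id(l)
l.append('c')
print(join_chars(l))  # still 'ab' -- same id, so the stale value is returned
```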


2 Comments

This does not use functools.lru_cache, so for example setting a maxsize does not work. Also, seq does not seem to be a built-in in Python 3, where do you get that function from?
@redfast00 thanks for the advice. My code only works in some situations; I have added those situations to the answer.
