2

In the following code, I have run into a RecursionError: maximum recursion depth exceeded.

def unpack(given):
    for i in given:
        if hasattr(i, '__iter__'):
            yield from unpack(i)
        else:
            yield i

some_list = ['a', ['b', 'c'], 'd']

unpacked = list(unpack(some_list))

This works fine if I use some_list = [1, [2, [3]]], but not when I try it with strings.

I suspect my lack of knowledge in python. Any guidance appreciated.

4
  • 2
    Strings are themselves iterable, and the things they iterate over are strings. for x in 'a' yields 'a' itself. Commented Apr 4, 2018 at 2:47
  • 1
    Try some_list = []; some_list.append(some_list); unpacked = list(unpack(some_list)) to see that this can happen with anything with depth>1000. So the remaining question is why every string has depth>1000, which wim's answer (and BallpointBen's comment) explains. Commented Apr 4, 2018 at 3:01
  • @abarnert For your case, is it that __iter__ for list returns itself, and naturally it's unending recursion? Commented Apr 4, 2018 at 3:10
  • @user7865286 Yes—or, maybe more simply, that the list contains itself: some_list[0] is some_list. I thought this would be less surprising than the fact that if s = 'a', s[0] is s, so it would help illuminate the problem, but now that I think about it, how many people actually know about recursive lists in Python? The only really obvious example would be a class that explicitly iterates itself, which is too big and distracting to be worth commenting about; better to just go straight to s[0] is s for strings as BallpointBen did. Commented Apr 4, 2018 at 3:22

1 Answer 1

7

Strings are infinitely iterable. Even a one-character string is iterable.

Therefore, you will always get a stack overflow unless you add special handling for strings:

def flatten(x):
    try:
        it = iter(x)
    except TypeError:
        yield x
        return
    if isinstance(x, (str, bytes)):
        yield x
        return
    for elem in it:
        yield from flatten(elem)

Note: using hasattr(i, '__iter__') is not sufficient to check if i is iterable, because there are other ways to satisfy the iterator protocol. The only reliable way to determine whether an object is iterable is to call iter(obj).

Sign up to request clarification or add additional context in comments.

8 Comments

It could be just a comment.
"using hasattr(i, '__iter__') is not sufficient to check if i is iterable" - true, but isinstance(x, Iterable) fails for many of the same cases. The most reliable check is to just call iter.
@user2357112 is right. In particular, the "old-style sequence iteration protocol" (which is actually a pair of related incomplete protocols, but never mind that) isn't captured by either test. Just define a class with a __len__ that returns an int (and make sure it's in range(0, 1<<32)), and a __getitem__ that doesn't raise IndexError for any value in range(0, __len__()) but does for __len__(), and you've got an iterable but not an Iterable.
What do you think is a more Pythonic way to implement flatten? I almost always prefer EAFP, but somehow I always come back to LBYL for this one. For example - which exception(s) to catch, is just TypeError enough? It's hard to say.
isinstance(x, Iterable) is definitely an improvement over hasattr(x, '__iter__'), at least. It'll pick up "anti-registration" through __iter__ = None, and it'll detect isinstance("asdf", Iterable) in Python 2 (though only due to a register call).
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.