
Suppose I have a function like this:

def getNeighbors(vertex):

which returns a list of vertices that are neighbors of the given vertex. Now I want to create a list with all the neighbors of the neighbors. I do that like this:

listOfNeighborsNeighbors = []
for neighborVertex in getNeighbors(vertex):
    listOfNeighborsNeighbors.append(getNeighbors(neighborVertex))

Is there a more pythonic way to do that?
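(For reference, here is a minimal toy getNeighbors so the snippets in the answers can be run as-is; the adjacency dict is made up and not part of the actual problem.)

# Hypothetical toy graph, purely for experimentation.
GRAPH = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a"],
    "d": ["b"],
}

def getNeighbors(vertex):
    # Return the list of vertices adjacent to `vertex`.
    return GRAPH[vertex]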


7 Answers


As usual, the itertools module contains a solution:

>>> l1 = [1, 2, 3]
>>> l2 = [4, 5, 6]
>>> l3 = [7, 8, 9]
>>> import itertools
>>> list(itertools.chain(l1, l2, l3))
[1, 2, 3, 4, 5, 6, 7, 8, 9]

2 Comments

Therefore the solution to the question is list(itertools.chain.from_iterable(getNeighbors(n) for n in getNeighbors(vertex)))
If ls = [l1,l2,l3] use list(itertools.chain(*ls)).
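Tying this back to the question, a quick sketch (it assumes the hypothetical toy getNeighbors defined under the question):

>>> vertex = "a"
>>> list(itertools.chain.from_iterable(getNeighbors(n) for n in getNeighbors(vertex)))
['a', 'd', 'a']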
[x for n in getNeighbors(vertex) for x in getNeighbors(n)]

or

sum((getNeighbors(n) for n in getNeighbors(vertex)), [])

4 Comments

+1 I was going to suggest a list comprehension. IMHO, it's the most pythonic way.
However, see the timing comparisons in the comments under emu's answer: both "itertools.chain" and "reduce(iadd)" are more than twice as fast as the nested list comprehension -- and MUCH faster than sum(), which degrades rapidly with the number of elements processed.
So glad I found this. Tried many times, never with such a 2nd argument [] to the sum of lists.
Second solution looks very cool. And works in practice. And it cost me hours of profiling and debugging because it just doesn't work for large N! Please put a note that the second solution has quadratic time complexity!

Lists can be concatenated with + and sum():

>>> c = [[1, 2], [3, 4]]
>>> sum(c, [])
[1, 2, 3, 4]

5 Comments

Thanks - I knew there had to be some way to do this with sum! BTW, it wasn't clear to me that this would work with more than 2 sub-lists, or variable length lists; so clearer example might be: c = [[1, 2], [3, 4, 5], [6, 7]] => [1, 2, 3, 4, 5, 6, 7]
BUT see the timings I did as comments under emu's answer. DO NOT USE SUM -- VERY SLOW FOR 100 lists of 100 items!
Why is the second argument to sum required? I would think sum([[1, 2], [3, 4]]) was clear as day to mean [1, 2] + [3, 4].
@KeithWM Because sum([[1, 2], [3, 4]]) does not mean [1, 2] + [3, 4], but rather 0 + [1, 2] + [3, 4], which doesn't work. You need the optional second argument to replace that starting 0 with a [], so that sum([[1, 2], [3, 4]], []) is [] + [1, 2] + [3, 4].
@Stef Thank you very much! That explains a lot of errors I've experienced in the past while using sum.
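A quick REPL illustration of the explanation above:

>>> sum([[1, 2], [3, 4]])        # start defaults to 0, so this tries 0 + [1, 2]
Traceback (most recent call last):
  ...
TypeError: unsupported operand type(s) for +: 'int' and 'list'
>>> sum([[1, 2], [3, 4]], [])    # start=[] makes it [] + [1, 2] + [3, 4]
[1, 2, 3, 4]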

Quickest to slowest:

list_of_lists = [[x, 1] for x in range(1000)]

%timeit list(itertools.chain.from_iterable(list_of_lists))
30 µs ± 320 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit list(itertools.chain(*list_of_lists))
33.4 µs ± 761 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

min(timeit.repeat("ll=[];\nfor l in list_of_lists:\n ll.extend(l)", "list_of_lists=[[x,1] for x in range(1000)]",repeat=3, number=100))/100.0
4.1411130223423245e-05

%timeit [y for z in list_of_lists for y in z]
53.9 µs ± 156 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit sum(list_of_lists, [])
1.5 ms ± 10.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

(Python 3.7.10)

Python 2:

list_of_lists = [[x,1] for x in xrange(1000)]

%timeit list(itertools.chain(*list_of_lists))
100000 loops, best of 3: 14.6 µs per loop

%timeit list(itertools.chain.from_iterable(list_of_lists))
10000 loops, best of 3: 60.2 µs per loop

min(timeit.repeat("ll=[];\nfor l in list_of_lists:\n ll.extend(l)", "list_of_lists=[[x,1] for x in xrange(1000)]",repeat=3, number=100))/100.0
9.620904922485351e-05

%timeit [y for z in list_of_lists for y in z]
10000 loops, best of 3: 108 µs per loop

%timeit sum(list_of_lists, [])
100 loops, best of 3: 3.7 ms per loop

3 Comments

itertools.chain(list_of_lists) is wrong (it won't concatenate anything because it's only given one parameter). You need a * there, or chain.from_iterable.
These timing results might be obsolete. Testing on 2018 HW with python3.6.6, I don't see any reproducible speed difference between the itertools.chain, itertools.chain.from_iterable, and functools.reduce/iadd solutions. YMMV. The iadd solution changes the inputs, though.
nobody said anything about the list of lists being dynamic so the from_iterable solution is off-topic imo

If speed matters, it may be better to use this:

from functools import reduce  # reduce is a builtin on Python 2; the import is needed on Python 3
from operator import iadd

reduce(iadd, (getNeighbors(n) for n in getNeighbors(vertex)))

The point of this code is that it concatenates whole lists with list.extend, whereas a list comprehension adds items one by one, as if calling list.append. That saves a bit of overhead, making the former (according to my measurements) about three times faster. (The iadd operator is normally written as += and does the same thing as list.extend.)
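A small REPL sketch of that equivalence (nothing here beyond the operator module's documented behaviour):

>>> from operator import iadd
>>> a = [1, 2]
>>> iadd(a, [3, 4])    # same as a += [3, 4]: extends a in place and returns it
[1, 2, 3, 4]
>>> a                  # a itself was modified
[1, 2, 3, 4]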

Using a list comprehension (the first solution by Ignacio) is still usually the right way; it is easier to read.

But definitely avoid using sum(..., []), because it runs in quadratic time. That is very impractical for many lists (more than a hundred or so).

8 Comments

Thanks for the comment re sum's performance -- I liked how compact that code is, so good to know not to use it on large scale. IMHO, Jochen's itertools'chain solution from '10 is a more appropriate solution than reduce: it more directly/simply does what is being asked for.
WARNING: iadd MODIFIES the first list passed in. Doesn't matter in the example, because the lists are results from a function. But I did one test where I passed in a list of lists that I had pre-computed. Altered my original list, which was not good to do. FIX: instead of reduce(iadd, LL) or even reduce(iadd, (L for L in LL)), wrap each returned L in list(): reduce(iadd, (list(L) for L in LL)). This forces each L to be copied. (Which is quick, because the size is known.) See the sketch after these comments.
.. List comprehension degrades more quickly (2.4 => 9.1). Sum is WAY worse (13.8 => 130.2)! Repeating those numbers together for easier comparison: (reduce, chain, comprehension, sum) @ 100x100 = (1.1, 1.1, 2.6, 13.8); @ 200x200 = (2.6, 4.0, 9.1, 130.2).
Test code (Python 2.7):

print timeit('all = reduce(operator.iadd, (list(list_) for list_ in LL))', number=1000, setup='n = 100; import operator; L1 = list(range(n)); LL = [[10 * x + v for v in L1] for x in range(n)]')
print timeit('all = list(itertools.chain(*LL))', number=1000, setup='n = 100; L1 = list(range(n)); LL = [[10 * x + v for v in L1] for x in range(n)]')
print timeit('all = [x for list_ in LL for x in list_]', number=...
print timeit('all = sum(LL, [])', number=...

Then repeat those four with n = 200 instead of 100. (Then I multiplied the resulting times by 10.)
@drevicko Because it has no choice but to construct a new list during each addition, and that is a linear-time operation.
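A minimal sketch of the pitfall described in the WARNING comment above, and the list() fix (variable names are illustrative):

from functools import reduce  # builtin on Python 2
from operator import iadd

LL = [[1, 2], [3, 4]]
flat = reduce(iadd, LL)          # flattens, but...
print(LL[0])                     # [1, 2, 3, 4] -- the first input list was altered

LL = [[1, 2], [3, 4]]
flat = reduce(iadd, (list(L) for L in LL))  # copy each sublist first
print(LL[0])                     # [1, 2] -- inputs left untouched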

I like the itertools.chain approach because it runs in linear time (sum(...) runs in quadratic time), but @Jochen didn't show how to deal with lists of dynamic length. Here is a solution for the OP's question.

import itertools
list(itertools.chain(*[getNeighbors(n) for n in getNeighbors(vertex)]))

You can get rid of the list(...) call if an iterable is sufficient for you.

2 Comments

You can also get rid of the unpacking *[getNeighbors...] by using chain.from_iterable like this: list(itertools.chain.from_iterable(getNeighbors(n) for n in getNeighbors(vertex)))
Or you can keep the unpacking but not have it generate a list by doing list(itertools.chain(*(getNeighbors(n) for n in getNeighbors(vertex))))

Using .extend() (update in place) combined with reduce, instead of sum() (which builds a new object each time), should be more efficient; however, I'm too lazy to test that :)

from functools import reduce  # needed on Python 3; builtin on Python 2

mylist = [[1, 2], [3, 4], [5, 6]]
reduce(lambda acc_l, sl: acc_l.extend(sl) or acc_l, mylist)  # .extend returns None, so "or acc_l" yields the accumulator
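A usage sketch; note that, like the iadd answer above, this mutates the first sublist, because reduce with no initializer seeds the accumulator with mylist[0]:

>>> reduce(lambda acc_l, sl: acc_l.extend(sl) or acc_l, mylist)
[1, 2, 3, 4, 5, 6]
>>> mylist[0]    # the accumulator was mylist[0] itself, extended in place
[1, 2, 3, 4, 5, 6]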

1 Comment

It is indeed faster, but as Yariv's answer shows, it's not the fastest approach.
