
Suppose I have a function like this:

def getNeighbors(vertex):

which returns a list of vertices that are neighbors of the given vertex. Now I want to create a list with all the neighbors of the neighbors. I do that like this:

listOfNeighborsNeighbors = []
for neighborVertex in getNeighbors(vertex):
    listOfNeighborsNeighbors.append(getNeighbors(neighborVertex))

Is there a more pythonic way to do that?
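(For reference, here is a minimal toy getNeighbors so the snippets in the answers can be run as-is; the adjacency dict is made up and not part of the actual problem.)

# Hypothetical toy graph, purely for experimentation.
GRAPH = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a"],
    "d": ["b"],
}

def getNeighbors(vertex):
    # Return the list of vertices adjacent to `vertex`.
    return GRAPH[vertex]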


7 Answers


As usual, the itertools module contains a solution:

>>> l1 = [1, 2, 3]
>>> l2 = [4, 5, 6]
>>> l3 = [7, 8, 9]
>>> import itertools
>>> list(itertools.chain(l1, l2, l3))
[1, 2, 3, 4, 5, 6, 7, 8, 9]

2 Comments

Therefore the solution to the question is list(itertools.chain.from_iterable(getNeighbors(n) for n in getNeighbors(vertex)))
If ls = [l1,l2,l3] use list(itertools.chain(*ls)).
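Tying this back to the question, a quick sketch (it assumes the hypothetical toy getNeighbors defined under the question):

>>> vertex = "a"
>>> list(itertools.chain.from_iterable(getNeighbors(n) for n in getNeighbors(vertex)))
['a', 'd', 'a']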
[x for n in getNeighbors(vertex) for x in getNeighbors(n)]

or

sum((getNeighbors(n) for n in getNeighbors(vertex)), [])

4 Comments

+1 I was going to suggest a list comprehension. IMHO, it's the most pythonic way.
However, see the timing comparisons in the comments under emu's answer: both "itertools.chain" and "reduce(iadd)" are more than twice as fast as the nested list comprehension -- and MUCH faster than sum(), which degrades rapidly with the number of elements processed.
So glad I found this. Tried many times, never with such a 2nd argument [] to the sum of lists.
Second solution looks very cool. And works in practice. And it cost me hours of profiling and debugging because it just doesn't work for large N! Please put a note that the second solution has quadratic time complexity!

Lists can be concatenated with + and sum():

>>> c = [[1, 2], [3, 4]]
>>> sum(c, [])
[1, 2, 3, 4]

5 Comments

Thanks - I knew there had to be some way to do this with sum! BTW, it wasn't clear to me that this would work with more than 2 sub-lists, or variable length lists; so clearer example might be: c = [[1, 2], [3, 4, 5], [6, 7]] => [1, 2, 3, 4, 5, 6, 7]
BUT see the timings I did as comments under emu's answer. DO NOT USE SUM -- VERY SLOW FOR 100 lists of 100 items!
Why is the second argument to sum required? I would think sum([[1, 2], [3, 4]]) was clear as day to mean [1, 2] + [3, 4].
@KeithWM Because sum([[1, 2], [3, 4]]) does not mean [1, 2] + [3, 4], but rather 0 + [1, 2] + [3, 4], which doesn't work. You need the optional second argument to replace that starting 0 with a [], so that sum([[1, 2], [3, 4]], []) is [] + [1, 2] + [3, 4].
@Stef Thank you very much! That explains a lot of errors I've experienced in the past while using sum.
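A quick REPL illustration of the explanation above:

>>> sum([[1, 2], [3, 4]])        # start defaults to 0, so this tries 0 + [1, 2]
Traceback (most recent call last):
  ...
TypeError: unsupported operand type(s) for +: 'int' and 'list'
>>> sum([[1, 2], [3, 4]], [])    # start=[] makes it [] + [1, 2] + [3, 4]
[1, 2, 3, 4]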

Quickest to slowest:

list_of_lists = [[x, 1] for x in range(1000)]

%timeit list(itertools.chain.from_iterable(list_of_lists))
30 µs ± 320 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit list(itertools.chain(*list_of_lists))
33.4 µs ± 761 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

min(timeit.repeat("ll=[];\nfor l in list_of_lists:\n ll.extend(l)", "list_of_lists=[[x,1] for x in range(1000)]",repeat=3, number=100))/100.0
4.1411130223423245e-05

%timeit [y for z in list_of_lists for y in z]
53.9 µs ± 156 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit sum(list_of_lists, [])
1.5 ms ± 10.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

(Python 3.7.10)

Python 2:

list_of_lists = [[x,1] for x in xrange(1000)]

%timeit list(itertools.chain(*list_of_lists))
100000 loops, best of 3: 14.6 µs per loop

%timeit list(itertools.chain.from_iterable(list_of_lists))
10000 loops, best of 3: 60.2 µs per loop

min(timeit.repeat("ll=[];\nfor l in list_of_lists:\n ll.extend(l)", "list_of_lists=[[x,1] for x in xrange(1000)]",repeat=3, number=100))/100.0
9.620904922485351e-05

%timeit [y for z in list_of_lists for y in z]
10000 loops, best of 3: 108 µs per loop

%timeit sum(list_of_lists, [])
100 loops, best of 3: 3.7 ms per loop

3 Comments

itertools.chain(list_of_lists) is wrong (it won't concatenate anything because it's only given one parameter). You need a * there, or chain.from_iterable.
These timing results might be obsolete. Testing on 2018 HW with python3.6.6, I don't see any reproducible speed difference between the itertools.chain, itertools.chain.from_iterable, and functools.reduce/iadd solutions. YMMV. The iadd solution changes the inputs, though.
nobody said anything about the list of lists being dynamic so the from_iterable solution is off-topic imo

If speed matters, it may be better to use this:

from functools import reduce  # reduce is a builtin on Python 2; the import is needed on Python 3
from operator import iadd

reduce(iadd, (getNeighbors(n) for n in getNeighbors(vertex)))

The point of this code is that it concatenates whole lists with list.extend, whereas a list comprehension adds items one by one, as if calling list.append. That saves a bit of overhead, making the former (according to my measurements) about three times faster. (The iadd operator is normally written as += and does the same thing as list.extend.)
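A small REPL sketch of that equivalence (nothing here beyond the operator module's documented behaviour):

>>> from operator import iadd
>>> a = [1, 2]
>>> iadd(a, [3, 4])    # same as a += [3, 4]: extends a in place and returns it
[1, 2, 3, 4]
>>> a                  # a itself was modified
[1, 2, 3, 4]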

Using a list comprehension (the first solution by Ignacio) is still usually the right way; it is easier to read.

But definitely avoid using sum(..., []), because it runs in quadratic time. That is very impractical for many lists (more than a hundred or so).

8 Comments

Thanks for the comment re sum's performance -- I liked how compact that code is, so good to know not to use it on large scale. IMHO, Jochen's itertools'chain solution from '10 is a more appropriate solution than reduce: it more directly/simply does what is being asked for.
WARNING: iadd MODIFIES the first list passed in. Doesn't matter in the example, because the lists are results from a function. But I did one test where I passed in a list of lists that I had pre-computed. Altered my original list, which was not good to do. FIX: instead of reduce(iadd, LL) or even reduce(iadd, (L for L in LL)), wrap each returned L in list(): reduce(iadd, (list(L) for L in LL)). This forces each L to be copied. (Which is quick, because the size is known.) See the sketch after these comments.
.. List comprehension degrades more quickly (2.4 => 9.1). Sum is WAY worse (13.8 => 130.2)! Repeating those numbers together for easier comparison: (reduce, chain, comprehension, sum) @ 100x100 = (1.1, 1.1, 2.6, 13.8); @ 200x200 = (2.6, 4.0, 9.1, 130.2).
Test code (Python 2.7):

print timeit('all = reduce(operator.iadd, (list(list_) for list_ in LL))', number=1000, setup='n = 100; import operator; L1 = list(range(n)); LL = [[10 * x + v for v in L1] for x in range(n)]')
print timeit('all = list(itertools.chain(*LL))', number=1000, setup='n = 100; L1 = list(range(n)); LL = [[10 * x + v for v in L1] for x in range(n)]')
print timeit('all = [x for list_ in LL for x in list_]', number=...
print timeit('all = sum(LL, [])', number=...

Then repeat those four with n = 200 instead of 100. (Then I multiplied the resulting times by 10.)
@drevicko Because it has no choice but to construct a new list during each addition, and that is a linear-time operation.
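A minimal sketch of the pitfall described in the WARNING comment above, and the list() fix (variable names are illustrative):

from functools import reduce  # builtin on Python 2
from operator import iadd

LL = [[1, 2], [3, 4]]
flat = reduce(iadd, LL)          # flattens, but...
print(LL[0])                     # [1, 2, 3, 4] -- the first input list was altered

LL = [[1, 2], [3, 4]]
flat = reduce(iadd, (list(L) for L in LL))  # copy each sublist first
print(LL[0])                     # [1, 2] -- inputs left untouched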

I like the itertools.chain approach because it runs in linear time (sum(...) runs in quadratic time), but @Jochen didn't show how to deal with lists of dynamic length. Here is a solution for the OP's question.

import itertools
list(itertools.chain(*[getNeighbors(n) for n in getNeighbors(vertex)]))

You can get rid of the list(...) call if an iterable is sufficient for you.

2 Comments

You can also get rid of the unpacking *[getNeighbors...] by using chain.from_iterable like this: list(itertools.chain.from_iterable(getNeighbors(n) for n in getNeighbors(vertex)))
Or you can keep the unpacking but not have it generate a list by doing list(itertools.chain(*(getNeighbors(n) for n in getNeighbors(vertex))))

Using .extend() (update in place) combined with reduce, instead of sum() (which builds a new object each time), should be more efficient; however, I'm too lazy to test that :)

from functools import reduce  # needed on Python 3; builtin on Python 2

mylist = [[1, 2], [3, 4], [5, 6]]
reduce(lambda acc_l, sl: acc_l.extend(sl) or acc_l, mylist)  # .extend returns None, so "or acc_l" yields the accumulator
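A usage sketch; note that, like the iadd answer above, this mutates the first sublist, because reduce with no initializer seeds the accumulator with mylist[0]:

>>> reduce(lambda acc_l, sl: acc_l.extend(sl) or acc_l, mylist)
[1, 2, 3, 4, 5, 6]
>>> mylist[0]    # the accumulator was mylist[0] itself, extended in place
[1, 2, 3, 4, 5, 6]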

1 Comment

It is indeed faster, but as Yariv's answer shows, it's not the fastest approach.
