514

Is there a short syntax for joining a list of lists into a single list (or iterator) in Python?

For example, I have a list as follows and I want to iterate over a, b and c.

x = [["a","b"], ["c"]]

The best I can come up with is as follows:

result = []
[result.extend(el) for el in x]

for el in result:
    print(el)

15 Answers

737
import itertools
a = [['a','b'], ['c']]
print(list(itertools.chain.from_iterable(a)))

This gives

['a', 'b', 'c']
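A minimal sketch contrasting the two spellings of chain, since both show up in answers here:

```python
import itertools

a = [['a', 'b'], ['c']]

# chain(*a) unpacks the outer list into separate arguments; this builds an
# intermediate tuple of the sublists before chaining them.
flat1 = list(itertools.chain(*a))

# chain.from_iterable(a) consumes the outer iterable lazily, so it also works
# when the outer sequence is itself a generator of unknown length.
flat2 = list(itertools.chain.from_iterable(a))

assert flat1 == flat2 == ['a', 'b', 'c']
```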

9 Comments

no need to list() it! for item in itertools.chain(*a): do something with item
A bit of explanation would also be nice. docs.python.org/library/itertools.html#itertools.chain
result = []; map(result.extend, a) is ~30% faster than itertools.chain. But chain.from_iterable is a tiny bit faster than map+extend. [Python 2.7, x86_64]
This explains what's happening with the *a: stackoverflow.com/questions/5239856/foggy-on-asterisk-in-python (it sends the elements of a as arguments to chain, like removing the outer [ and ]).
chain.from_iterable is significantly faster if you have many iterables to concatenate. For me it was ~50% faster when creating ctypes arrays of OpenGL vertices from 100s of python lists containing 10s or 100s of vertices each. The '*' operator converts your iterable into an intermediate tuple that it passes in to chain.
385
x = [["a","b"], ["c"]]

result = sum(x, [])
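sum works here because + concatenates lists and the explicit [] replaces the default start value of 0; a minimal sketch of what it expands to:

```python
x = [["a", "b"], ["c"]]

# sum(x, []) evaluates as (([] + ["a", "b"]) + ["c"]); each + builds a
# brand-new list, which is why this approach is O(n^2) for many sublists.
flat = sum(x, [])
assert flat == [] + ["a", "b"] + ["c"]
```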

9 Comments

@Aaron, explain for a noob python learner please: Is O(n^2) good or bad in this case? ;-)
O(n^2) here basically means that the time required for this function to execute is proportional to the square of the length of the inputs. So if you double the inputs, you quadruple the time required. This is a Bad Thing if you have large inputs, but for small ones it should be fine. But a faster method will be better.
Can somebody explain how and why sum(x, []) works?
What an awful thing to read! The more I use python the more it's half finished state hits home. I can't imagine why the standard library doesn't have a flat() function built in.
@Requin sum(iterable, start) adds all elements in iterable to start. Adding lists concatenates them. start defaults to 0, which would break if iterable consists of lists, as you can't add a list to an int, so we pass an empty list as start.
281

If you're only going one level deep, a nested comprehension will also work:

>>> x = [["a","b"], ["c"]]
>>> [inner
...     for outer in x
...         for inner in outer]
['a', 'b', 'c']

On one line, that becomes:

>>> [j for i in x for j in i]
['a', 'b', 'c']
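The clause order reads left-to-right exactly like the equivalent nested for loops, which is the easiest way to keep it straight:

```python
x = [["a", "b"], ["c"]]

# [j for i in x for j in i] is shorthand for:
result = []
for i in x:
    for j in i:
        result.append(j)

assert result == [j for i in x for j in i]
```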

5 Comments

Very cool, so for the next depth-level it will become [i for ll in x for l in ll for i in l] - at this point it starts getting a bit lame for the reader, but nevertheless cool :)
For three levels, it gets nasty: >>> x = [[["a", "b"], ["c"]], [["d"]]] >>> [k for i in x for j in i for k in j] ['a', 'b', 'c', 'd']
Listception.. this is definitely unpythonic / against the zen of python in that it is not the simplest or most explicit way to do it. You end up hard coding recursion. Still cool though.
@ZachEstela, I'm happy to see someone call this unpythonic. It seems like many techniques others like to call pythonic are not easily understood at first glance. Readability is one of the things that makes Python attractive to me. This solution is cool, and probably the fastest, but the sum(x, []) solution is much more Pythonic.
Those "more pythonic" answers are just wrong. The question wasn't about recursive joining, but joining a list of lists, which means there are no more depth levels to join.
57
flat_list = []
map(flat_list.extend, list_of_lists)

shortest!
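Note that in Python 3 map is lazy, so the extend calls never run unless the map object is consumed; a sketch of one way to force them:

```python
flat_list = []
list_of_lists = [["a", "b"], ["c"]]

# In Python 3, map() returns a lazy iterator, so flat_list.extend is never
# actually called until the map object is consumed. Wrapping it in list()
# (or any other consumer) forces the side effects to happen.
list(map(flat_list.extend, list_of_lists))
```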

6 Comments

sum(listoflists,[]) # shorter!
@recursive Shorter but different functionally = much worse performance-wise, see comments on other variants for explanation
This tiny snippet appears to be the fastest way around for non-recursive flatten. Needs more upvotes.
in Python 3.1+, wrap map with list(), or else you'll see <map object at 0x0000...> when you print the result
This solution does not work for me.
33

If you need a list, not a generator, use list():

from itertools import chain
x = [["a","b"], ["c"]]
y = list(chain(*x))

3 Comments

s/x/*x/ (or chain.from_iterable(x) preferably)
I do not understand what it does. join is supposed to have a separator.
@Val chain makes a generator that will output 'a', 'b', 'c'. list converts it into a list.
33

This is known as flattening, and there are a LOT of implementations out there.

How about this, although it will only work for 1 level deep nesting:

>>> x = [["a","b"], ["c"]]
>>> for el in sum(x, []):
...     print(el)
...
a
b
c

Of the many implementations out there, apparently the most complete-fast-elegant-etc is the following:

def flatten(l, ltypes=(list, tuple)):
    ltype = type(l)
    l = list(l)
    i = 0
    while i < len(l):
        while isinstance(l[i], ltypes):
            if not l[i]:
                l.pop(i)
                i -= 1
                break
            else:
                l[i:i + 1] = l[i]
        i += 1
    return ltype(l)

4 Comments

Ah, 'sum(L,I)' is shorthand for 'reduce(plus_operator, L, I)'. That's kinda cool.
your "most complete-elegant-etc" is not "elegant" at all!! see the docs for itertools.chain to see true elegance!
@hasen j: I believe he means best for arbitrary nested lists. chain assumes a consistent, one-deep list of lists (which is probably all the question needs), but flatten handles things like [a,b,[c], [d,[e,f]],[[[g]]]].
Unfortunately this breaks if you're using pylab, because numpy's sum gets imported into the global namespace, and that function doesn't work that way.
30

A performance comparison:

import itertools
import timeit
big_list = [[0]*1000 for i in range(1000)]
timeit.repeat(lambda: list(itertools.chain.from_iterable(big_list)), number=100)
timeit.repeat(lambda: list(itertools.chain(*big_list)), number=100)
timeit.repeat(lambda: (lambda b: map(b.extend, big_list))([]), number=100)
timeit.repeat(lambda: [el for list_ in big_list for el in list_], number=100)
[100*x for x in timeit.repeat(lambda: sum(big_list, []), number=1)]

Producing:

>>> import itertools
>>> import timeit
>>> big_list = [[0]*1000 for i in range(1000)]
>>> timeit.repeat(lambda: list(itertools.chain.from_iterable(big_list)), number=100)
[3.016212113769325, 3.0148865239060227, 3.0126415732791028]
>>> timeit.repeat(lambda: list(itertools.chain(*big_list)), number=100)
[3.019953987082083, 3.528754223385439, 3.02181439266457]
>>> timeit.repeat(lambda: (lambda b: map(b.extend, big_list))([]), number=100)
[1.812084445152557, 1.7702404451095965, 1.7722977998725362]
>>> timeit.repeat(lambda: [el for list_ in big_list for el in list_], number=100)
[5.409658160700605, 5.477502077679354, 5.444318360412744]
>>> [100*x for x in timeit.repeat(lambda: sum(big_list, []), number=1)]
[399.27587954973444, 400.9240571138051, 403.7521153804846]

This is with Python 2.7.1 on Windows XP 32-bit, but @temoto in the comments above got from_iterable to be faster than map+extend, so it's quite platform and input dependent.

Stay away from sum(big_list, [])

2 Comments

Super helpful. Thanks! Note that in Python3, we need a list() around the map() version, otherwise the results are too good to be true.
There's a few downvotes. I can't figure out what they're referring to. If you see a mistake, could you point it out? If there is a mistake, it should be easy to fix, which would be nice for the future generations of visitors.
16

This works recursively for infinitely nested elements:

def iterFlatten(root):
    if isinstance(root, (list, tuple)):
        for element in root:
            for e in iterFlatten(element):
                yield e
    else:
        yield root

Result:

>>> b = [["a", ("b", "c")], "d"]
>>> list(iterFlatten(b))
['a', 'b', 'c', 'd']

3 Comments

>>> a = [] >>> a.append(a) >>> b = iterFlatten(a) >>> next(b) RuntimeError: maximum recursion depth exceeded in __instancecheck__ :)
@Darthfett would you expect a meaningful result for flattening an "infinitely-nested list"? :-)
@Kos A version that checks for such cases (by using a stack/set to check for self-references in a list) could be preferable than simply continuing to flatten until reaching the recursion depth limit. This could bypass the issue by simply giving the value, instead of trying to flatten it.
8

Late to the party but ...

I'm new to python and come from a lisp background. This is what I came up with (check out the var names for lulz):

def flatten(lst):
    if lst:
        car, *cdr = lst
        if isinstance(car, (list, tuple)):
            if cdr: return flatten(car) + flatten(cdr)
            return flatten(car)
        if cdr: return [car] + flatten(cdr)
        return [car]

Seems to work. Test:

flatten((1,2,3,(4,5,6,(7,8,(((1,2)))))))

returns:

[1, 2, 3, 4, 5, 6, 7, 8, 1, 2]

4 Comments

You come from a lisp background? I never would have guessed from the code... haha
Nice, been doing Python for some time now and I haven't seen var-arg tuple unpacking like you did with car, *cdr. (edit: probably because it's Python 3 and I'm still digging 2 for some reason :-))
What's the point of the if lst:?
it's 4 a.m. and I was hoping to find something that handles lists nested 3-4 levels deep; thank you for writing this since I am not able to code right now :D.
5

What you're describing is known as flattening a list, and with this new knowledge you'll be able to find many solutions to this on Google (there is no built-in flatten method). Here is one of them, from http://www.daniel-lemire.com/blog/archives/2006/05/10/flattening-lists-in-python/:

def flatten(x):
    ans = []
    for i in x:
        if i.__class__ is list:
            ans = ans + flatten(i)
        else:
            ans.append(i)
    return ans

2 Comments

This method works nicely for a mix of lists of strings and strings (e.g. [['some', 'string'], 'and', 'another'] ), while the itertools techniques do not. This works well for my needs.
this method does not work when you have something like this [[[a, b, c], [d, e, f]], [[g, h, l]]]
4

For one-level flatten, if you care about speed, this is faster than any of the previous answers under all conditions I tried. (That is, if you need the result as a list. If you only need to iterate through it on the fly then the chain example is probably better.) It works by pre-allocating a list of the final size and copying the parts in by slice (which is a lower-level block copy than any of the iterator methods):

def join(a):
    """Joins a sequence of sequences into a single sequence.  (One-level flattening.)
    E.g., join([(1,2,3), [4, 5], [6, (7, 8, 9), 10]]) = [1,2,3,4,5,6,(7,8,9),10]
    This is very efficient, especially when the subsequences are long.
    """
    n = sum([len(b) for b in a])
    l = [None]*n
    i = 0
    for b in a:
        j = i+len(b)
        l[i:j] = b
        i = j
    return l

Sorted times list with comments:

[(0.5391559600830078, 'flatten4b'), # join() above. 
(0.5400412082672119, 'flatten4c'), # Same, with sum(len(b) for b in a) 
(0.5419249534606934, 'flatten4a'), # Similar, using zip() 
(0.7351131439208984, 'flatten1b'), # list(itertools.chain.from_iterable(a)) 
(0.7472689151763916, 'flatten1'), # list(itertools.chain(*a)) 
(1.5468521118164062, 'flatten3'), # [i for j in a for i in j] 
(26.696547985076904, 'flatten2')] # sum(a, [])

5 Comments

Can you add timings to confirm that this is faster than the other methods presented?
You skipped map(result.extend, a)
There's a benchmark ideone.com/9q3mrp
@Kos, you are right! I'm lame. I probably omitted it originally because it "obviously" has bad O() time due to multiple copies, but now that I add it to my test, in practice it looks like it is successfully using realloc() to avoid this, and so it is winning hands down under all conditions. I remain skeptical, though, that it might revert to horrible behavior in a real working environment with fragmented memory. In a simple test app like this, with a clean slate of memory, it is free to keep extending the array without moving it. Thoughts?
3

There's always reduce (being deprecated to functools):

>>> x = [ [ 'a', 'b'], ['c'] ]
>>> for el in reduce(lambda a,b: a+b, x, []):
...  print el
...
__main__:1: DeprecationWarning: reduce() not supported in 3.x; use functools.reduce()
a
b
c
>>> import functools
>>> for el in functools.reduce(lambda a,b: a+b, x, []):
...   print el
...
a
b
c
>>>

Unfortunately the plus operator for list concatenation can't be used directly as a function -- or fortunately, if you prefer lambdas to be ugly for the sake of improved visibility.
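A hedged sketch of an alternative that reduces with list.extend instead of +, avoiding the temporary concatenated lists (the helper name extend_and_return is mine):

```python
from functools import reduce

def extend_and_return(acc, part):
    # list.extend returns None, so wrap it to return the accumulator,
    # making it usable with reduce; no intermediate concatenated lists
    # are created along the way.
    acc.extend(part)
    return acc

flat = reduce(extend_and_return, [["a", "b"], ["c"]], [])
```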

4 Comments

GAH, I cannot believe they are deprecating it to functools. Anyway, you don't need the extra empty list, this will work just fine: reduce(lambda a,b: a+b, x)
Versions of the operators are defined as functions in the operator module, which are faster and less ugly than the lambda: "functools.reduce(operator.add, [[1,2,3],[4,5]],[])". Alternatively, just use sum()
Personally, I think the lambda way is quite pretty. :-)
If you want to do a reduce, then reduce over extend not add to avoid spamming the memory with temporary lists. Wrap extend with a function that extends then returns the list itself.
2

Or a recursive operation:

def flatten(input):
    ret = []
    if not isinstance(input, (list, tuple)):
        return [input]
    for i in input:
        if isinstance(i, (list, tuple)):
            ret.extend(flatten(i))
        else:
            ret.append(i)
    return ret

Comments

1

Sadly, Python doesn't have a simple way to flatten lists. Try this:

def flatten(some_list):
    for element in some_list:
        if type(element) in (tuple, list):
            for item in flatten(element):
                yield item
        else:
            yield element

Which will recursively flatten a list; you can then do

for el in flatten(x):
    print(el)

Comments

1

I had a similar problem when I had to create a dictionary that contained the elements of an array and their counts. The answer is relevant because I flatten a list of lists, get the elements I need, and then do a group and count. I used Python's map function to produce a tuple of each element and its count, and groupby over the array. Note that groupby takes the array element itself as the keyfunc. As a relatively new Python coder, I find this easier to comprehend, while being Pythonic as well.

Before I discuss the code, here is a sample of data I had to flatten first:

{ "_id" : ObjectId("4fe3a90783157d765d000011"), "status" : [ "opencalais" ],
  "content_length" : 688, "open_calais_extract" : { "entities" : [
  {"type" :"Person","name" : "Iman Samdura","rel_score" : 0.223 }, 
  {"type" : "Company",  "name" : "Associated Press",    "rel_score" : 0.321 },          
  {"type" : "Country",  "name" : "Indonesia",   "rel_score" : 0.321 }, ... ]},
  "title" : "Indonesia Police Arrest Bali Bomb Planner", "time" : "06:42  ET",         
  "filename" : "021121bn.01", "month" : "November", "utctime" : 1037836800,
  "date" : "November 21, 2002", "news_type" : "bn", "day" : "21" }

It is a query result from Mongo. The code below flattens a collection of such lists.

def flatten_list(items):
  return sorted([entity['name'] for entity in
    [entities for sublist in
     [item['open_calais_extract']['entities'] for item in items]
     for entities in sublist]])

First, I extract all the "entities" collections, and then for each entities collection, iterate over the dictionaries and extract the name attribute.
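For readers newer to comprehensions, here is a hypothetical step-by-step equivalent (the function name flatten_list_loop is mine, and it assumes each item has the nested shape of the sample document above):

```python
def flatten_list_loop(items):
    names = []
    for item in items:
        # Each item keeps its entities under item['open_calais_extract']['entities'].
        for entity in item['open_calais_extract']['entities']:
            names.append(entity['name'])
    return sorted(names)
```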

Comments
