141

What is the fastest and most elegant way to build a list of lists from two lists?

I have

In [1]: a=[1,2,3,4,5,6]

In [2]: b=[7,8,9,10,11,12]

In [3]: zip(a,b)
Out[3]: [(1, 7), (2, 8), (3, 9), (4, 10), (5, 11), (6, 12)]

And I'd like to have

In [3]: some_method(a,b)
Out[3]: [[1, 7], [2, 8], [3, 9], [4, 10], [5, 11], [6, 12]]

I was thinking about using map instead of zip, but I don't know if there is a standard library function to pass as the first argument.

I can define my own function for this and use map; my question is whether something like this is already implemented. "No" is also an answer.

3
  • 1
    Well, do you really need lists? What are you going to do with the results? Commented Dec 4, 2011 at 2:56
  • 19
    An example would be sklearn, where many times data must be organized in this fashion. Commented Dec 1, 2013 at 8:15
  • 1
    @KarlKnechtel Yes he [/we] need a list Commented Aug 27, 2022 at 5:00

7 Answers

144

If you are zipping more than 2 lists (or even only 2, for that matter), a readable way would be:

[list(a) for a in zip([1,2,3], [4,5,6], [7,8,9])]

This uses a list comprehension to apply list to each element (tuple) in the list, converting them into lists.
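Applied to the two lists from the question, this could look like the following minimal sketch (the names `pairs` and `pair` are only illustrative):

```python
a = [1, 2, 3, 4, 5, 6]
b = [7, 8, 9, 10, 11, 12]

# zip yields one tuple per position; list() turns each tuple into a list
pairs = [list(pair) for pair in zip(a, b)]

print(pairs)
# [[1, 7], [2, 8], [3, 9], [4, 10], [5, 11], [6, 12]]
```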


91

You almost had the answer yourself. Don't use map instead of zip. Use map AND zip.

You can use map along with zip for an elegant, functional approach:

list(map(list, zip(a, b)))

zip returns an iterator of tuples (a list of tuples in Python 2). map(list, ...) calls list on each tuple. list(map(...)) turns the map object into a readable list.
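The same expression can be broken into its steps. This is a sketch assuming Python 3, where both zip and map are lazy; the intermediate names are illustrative only:

```python
a = [1, 2, 3, 4, 5, 6]
b = [7, 8, 9, 10, 11, 12]

zipped = zip(a, b)            # lazy iterator of tuples
as_lists = map(list, zipped)  # lazy iterator of lists; nothing is computed yet
result = list(as_lists)       # consumes both iterators and builds the final list

print(result)
# [[1, 7], [2, 8], [3, 9], [4, 10], [5, 11], [6, 12]]

# The iterators are now exhausted, so a second pass yields nothing
print(list(as_lists))
# []
```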

3 Comments

The unfortunate decision to make Python 3 collection operations return a generator imposes the cost of the double list here.
@WestCoastProjects: 1) It returns an iterator (generators are a specific type of iterator that enables coroutines) 2) That's not an unfortunate decision. In the majority of cases you want to iterate the result; making these functions produce a list of results eagerly means that you can't process the first result until you've processed all of them, you can't break the loop early and skip producing the rest of the results, and if the inputs are large but lazy themselves (e.g. file objects, itertools functions) you'll blow your RAM realizing all results when you don't need to.
Adding list() around map in the minority case is a small price to pay for the benefits realized all the times you don't need the wrapping (or when you do, but you don't want a plain unsorted list; sorted(map(...)), set(map(...)), etc. all involved a pointless temporary list in Py2 which Py3 avoids).
16

I love the elegance of the zip function, but using the itemgetter() function in the operator module appears to be much faster. I wrote a simple script to test this:

import time
from operator import itemgetter

list1 = list()
list2 = list()
origlist = list()
for i in range(1, 5000000):
    t = (i, 2*i)
    origlist.append(t)

print "Using zip"
starttime = time.time()
list1, list2 = map(list, zip(*origlist))
elapsed = time.time()-starttime
print elapsed

print "Using itemgetter"
starttime = time.time()
list1 = map(itemgetter(0),origlist)
list2 = map(itemgetter(1),origlist)
elapsed = time.time()-starttime
print elapsed

I expected zip to be faster, but the itemgetter method wins by a long shot:

Using zip
6.1550450325
Using itemgetter
0.768098831177
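The script above is written for Python 2 (print statements, and map returning a list). On Python 3, map is lazy, so each map call has to be wrapped in list() before the timing measures real work. A hedged sketch of the same comparison using the timeit module follows; the function names `with_zip` and `with_itemgetter` are made up for illustration, and the input is much smaller than the original 5,000,000 pairs so that it runs quickly:

```python
import timeit
from operator import itemgetter

# Smaller input than the original script, purely to keep the run short
origlist = [(i, 2 * i) for i in range(1, 100_000)]

def with_zip():
    # zip(*origlist) transposes the pairs into two tuples
    list1, list2 = map(list, zip(*origlist))
    return list1, list2

def with_itemgetter():
    # list() forces the lazy map objects to actually do the work
    list1 = list(map(itemgetter(0), origlist))
    list2 = list(map(itemgetter(1), origlist))
    return list1, list2

# Both approaches must agree before comparing their timings
assert with_zip() == with_itemgetter()

print("zip:       ", timeit.timeit(with_zip, number=10))
print("itemgetter:", timeit.timeit(with_itemgetter, number=10))
```

The absolute numbers depend on the machine and Python version, so no expected timings are shown here.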

6 Comments

This is a transpose of what the OP is trying to do. Could you update your post to reflect that? I.e., the OP is converting two lists to a list of an arbitrary number of pairs. You are converting an arbitrary number of pairs to a pair of lists.
Which python version is this measured with?
I don't remember, it was over two years ago, but most likely 2.6 or 2.7. I imagine you can copy the code and try it on your own version/platform.
Python 2's zip creates a real list, which slows things down. Try replacing zip with itertools.izip.
@EliasStrehle: In Python 3.5, map is lazy; it didn't do any real work for case #2 (just created the map object w/o mapping anything from the inputs). If you make sure it produces a list, e.g. list(map(itemgetter(0),origlist)), and use a proper microbenchmarking method like the timeit module/IPython's %timeit magic, the times are much more similar. On 3.10, %timeit list1, list2 = map(list, zip(*origlist)) is around 600 ms per loop, %timeit list1, list2 = list(map(itemgetter(0), origlist)), list(map(itemgetter(1), origlist)) is ~575 ms. Faster, but irrelevantly so.
8

How about this?

>>> def list_(*args): return list(args)

>>> map(list_, range(5), range(9,4,-1))
[[0, 9], [1, 8], [2, 7], [3, 6], [4, 5]]

Or even better:

>>> def zip_(*args): return map(list_, *args)
>>> zip_(range(5), range(9,4,-1))
[[0, 9], [1, 8], [2, 7], [3, 6], [4, 5]]

Update for Python 3: In Python 3 map returns an iterator and not a list. This is the fastest from a few options I've tested (timed using the timeit module):

[list(t) for t in zip(*lists)]

Update for 3.12: The fastest way so far seems to be

[[*t] for t in zip(a, b)]
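For reference, a small sketch showing that the Python 3 variants mentioned in this answer produce identical results (variable names are illustrative). Relative speed depends on the Python version and input size, so it is worth measuring with timeit on your own setup:

```python
a = [1, 2, 3, 4, 5, 6]
b = [7, 8, 9, 10, 11, 12]

unpacked = [[*t] for t in zip(a, b)]      # iterable unpacking into a list display
converted = [list(t) for t in zip(a, b)]  # explicit list() call per tuple
mapped = list(map(list, zip(a, b)))       # map-based version

# All three build the same list of lists
assert unpacked == converted == mapped

print(unpacked)
# [[1, 7], [2, 8], [3, 9], [4, 10], [5, 11], [6, 12]]
```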

1 Comment

That seems to me a better answer than the rest as here we are reducing one step by not doing a zip and directly creating a list. Awesome
5

I generally don't like using lambda, but...

>>> a = [1, 2, 3, 4, 5]
>>> b = [6, 7, 8, 9, 10]
>>> c = lambda a, b: [list(c) for c in zip(a, b)]
>>> c(a, b)
[[1, 6], [2, 7], [3, 8], [4, 9], [5, 10]]

If you need the extra speed, map is slightly faster (this returns a list in Python 2; in Python 3, wrap it in list(...) to get the same output):

>>> d = lambda a, b: map(list, zip(a, b))
>>> d(a, b)
[[1, 6], [2, 7], [3, 8], [4, 9], [5, 10]]

However, map is often considered less Pythonic than a list comprehension, so I would only reach for it when tuning performance.
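If a named function is wanted, a plain def does the same job as the lambda. A minimal sketch, where the name `lists_from` is hypothetical and not a standard library function:

```python
def lists_from(*iterables):
    """Zip any number of iterables and return each group as a list."""
    return [list(group) for group in zip(*iterables)]

a = [1, 2, 3, 4, 5]
b = [6, 7, 8, 9, 10]

print(lists_from(a, b))
# [[1, 6], [2, 7], [3, 8], [4, 9], [5, 10]]
```

Accepting `*iterables` keeps it usable for more than two inputs without any change.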

4 Comments

What does lambda add here? One can just write the expression instead of calling a function (it's really not complicated), and even if one wants a function for it, it can be defined painlessly in two lines (one if your return key is broken or you're insane). map on the other hand is perfectly fine if the first argument would be a plain function (as opposed to a lambda).
Well he asked for a function. But I agree-- probably better just to pay the extra line. As for map, I believe list comprehensions are almost always clearer.
I would recommend map over lambda. so map(list, zip(a,b)). List comprehensions may be a little clearer, but map should be faster (untested)
I mean, again, if the OP needs speed, map is the way to go. But in general, and in Python especially, emphasize readability over speed (else you dip into premature optimization).
5

A list comprehension would be a very simple solution, I guess.

a=[1,2,3,4,5,6]

b=[7,8,9,10,11,12]

x = [[i, j] for i, j in zip(a,b)]

print(x)

output : [[1, 7], [2, 8], [3, 9], [4, 10], [5, 11], [6, 12]]


3

Using numpy

What counts as elegant is debatable, but if you are already working with numpy, creating an array and converting it to a list (if needed) can be very practical, even though it is less efficient than the map function or a list comprehension.

import numpy as np 
a = b = range(10)
zipped = zip(a,b)
# result = np.array(zipped).tolist() Python 2.7
result = np.array(list(zipped)).tolist()
Out: [[0, 0],
 [1, 1],
 [2, 2],
 [3, 3],
 [4, 4],
 [5, 5],
 [6, 6],
 [7, 7],
 [8, 8],
 [9, 9]]

Otherwise, you can skip the zip function and use np.dstack directly:

np.dstack((a,b))[0].tolist()

3 Comments

The first example does not work for me: np.array(zipped) is an array(<class 'zip'>, dtype=object), and putting it into a list just returns a zip.
However, np.array(list(zipped)).tolist() will work.
@JeanBouvattier thanks for your comment. Yes, this is because in Python 3 zip no longer returns a list but a zip object.
