0

Good evening!

I was running some tests on lists and list creation vs iterator creation and I came across some staggering time differences. Observe the following:

>>> timeit.timeit('map(lambda x: x**3, [1, 2, 3, 4, 5])')
0.4515998857965542
>>> timeit.timeit('list(map(lambda x: x**3, [1, 2, 3, 4, 5]))')
2.868906182460819

The iterator version returned by the first test runs more than 6x as fast as converting to a list. I understand basically why this might be occurring, but what I'm more interested in is a solution. Does anyone know of a data structure similar to a list that offers fast creation time? (Basically, I want to know if there is a way to go straight from iterator (i.e. map or filter function, etc.), to a list without any major performance hits)

Things I can sacrifice for speed:

  1. Appending, inserting, popping and deleting elements.

  2. Slicing of elements.

  3. Reversing the list or any inplace operators like sort.

  4. Contains (in) operator.

  5. Concatenation and multiplication.

All suggestions are welcome thanks!

EDIT: Indeed this is for python 3.

3
  • 1
    Indubitably, this is python 3. Commented Apr 12, 2014 at 4:07
  • 3
    .. but if this is Python 3, your first example doesn't create a list in the first place. So you're doing math in the second that you're not in the first. Commented Apr 12, 2014 at 4:09
  • If you want performance, you could use numpy for this. Commented Apr 12, 2014 at 4:12

3 Answers 3

4

In Python 3.x, map doesn't create a list, but just an iterator, unlike Python 2.x.

print(type(map(lambda x: x**3, [1, 2, 3, 4, 5])))
# <class 'map'>

To really get a list, iterate it with the list function, like this

print(type(list(map(lambda x: x**3, [1, 2, 3, 4, 5]))))
# <class 'list'>

So, you are really not comparing two similar things.

Sign up to request clarification or add additional context in comments.

Comments

4

Expanding on thefourtheye's answer; The expressions inside the map function will not be evaluated before you iterate over it. This example should be pretty clear:

from time import sleep

def badass_heavy_function():
    sleep(3600)

# Method call isn't evaluated
foo = map(lambda x: x(), [badass_heavy_function, badass_heavy_function])

# Methods call will be evaluated, please wait 2 hours
bar = list(map(lambda x: x(), [badass_heavy_function, badass_heavy_function]))

for _ in foo:
    # Please wait one hour
    pass

Comments

2

To further extend the answers of the other two guys:

You had a misconception about the iterator. But you refer to as "slow creation time", and then you look for a "faster container", because of your misinterpretation.

Note that the creation of a list object in python is fast:

%timeit list(range(10000))
10000 loops, best of 3: 164 µs per loop

What you experience as slow is the actual loop that you need to do calculate the values that need to go into the list.

see a very unoptimized example of slowly "creating" a new list of another list:

x = list(range(10000))

def slow_loop(x):
    new = []
    for i in x:
        new.append(i**2)
    return new
%timeit slow_loop(x)
100 loops, best of 3: 4.17 ms per loop

the time that is spent is actually on the loop, that is "slow" in python.

This is actually what you are doing here technically if you compare:

def your_loop(x):
    return list(map(lambda y: y**2, x))

%timeit your_loop(x)
100 loops, best of 3: 4.5 ms per loop

There is a way to speed this up though:

def faster_loop(x):
    return [i**2 for i in x]


%timeit faster_loop(x)
100 loops, best of 3: 3.67 ms per loop

although not by much given this kind of function. The thing is: the slow part here is the math, not the list and not the container. You can prove this by using numpy

arr = np.array(x)

%timeit arr ** 2
100000 loops, best of 3: 7.44 µs per loop

Woah... crazy speedup.

With the benchmarking - I find myself guilty of this quite often as well - people doubt the system too often but themselves not often enough. So it's not like python is very unoptimized or "slow" it's just that you're doing it wrong. Don't doubt the python list efficiency. Doubt your slow, inefficient code. You will probably get it right quicker...

It seems here the pure python ** operator is very slow, as a simple multiplication is much quicker:

def faster_loop2(x):
    return [i * i for i in x]

%timeit faster_loop2(x)
1000 loops, best of 3: 534 µs per loop

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.