Performance of list creation in Python

Question

What is the performance overhead of different methods of creating and filling lists in Python?

It's certainly encouraged to write self-answered questions on SO, but the question should look like a normal question.
– PM 2Ring, Commented Jan 15, 2018 at 13:33

questiondude · Accepted Answer · 2018-01-15 13:49:26Z


The following is intended to measure the overhead of different methods of creating and filling lists in Python. In a real program you would of course do something more meaningful with the data added to the lists.

To test this, I made several files named test1.py, test2.py, etc., each defining a function foo() that creates and fills a list in a different way. I then ran each of them with the timeit module:

python -m timeit "from test1 import foo; foo()"

This was tested with Python 3.6.0 on a laptop PC with a 2.4 GHz CPU.
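The same kind of measurement can also be scripted from inside Python instead of the shell. This is only a minimal sketch and not the script used for the numbers below: the helper name time_call and the reduced list size N are assumptions, chosen so the example runs quickly.

```python
import timeit

N = 100_000  # assumption: much smaller than the 10,000,000 used in the post


def foo():
    # Same shape as the test functions in the post, just with a smaller N.
    a = list(range(N))
    return a


def time_call(func, number=10):
    # Hypothetical helper: average seconds per call over `number` calls.
    return timeit.timeit(func, number=number) / number


if __name__ == "__main__":
    print(f"{time_call(foo) * 1000:.3f} msec per call")
```

Passing the function object directly to timeit.timeit avoids the import string used on the command line, at the cost of one extra function call per iteration.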

The results are listed from fastest to slowest:

Range (237 msec per loop)

def foo():
    a = list(range(10000000))

This variant was added because commenters asked about its performance. Note that it is only useful for filling a list with sequential numbers.

List comprehension (380 msec per loop)

def foo():
    a = [i for i in range(10000000)]

Pre-allocated list (492 msec per loop)

def foo():
    k = 10000000
    a = [0] * k
    for i in range(k):
        a[i] = i

This pre-allocates the entire list, so the for-loop merely fills it without calling list.append. Although append runs in amortized constant time, it is not free: the list has to be grown periodically as it fills up, which requires allocating new memory and possibly copying the contents of the list. Pre-allocating the list avoids the expense of growing it.
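The periodic growth described above can be observed indirectly. The following is a rough sketch that assumes CPython, where sys.getsizeof on a list reflects its currently allocated capacity; the function name count_resizes is made up for illustration.

```python
import sys


def count_resizes(n):
    # Append n items and count how often the list's reported size changes.
    # Assumption: on CPython, a change in sys.getsizeof(a) means the
    # underlying array was reallocated with extra spare capacity.
    a = []
    resizes = 0
    last_size = sys.getsizeof(a)
    for i in range(n):
        a.append(i)
        size = sys.getsizeof(a)
        if size != last_size:
            resizes += 1
            last_size = size
    return resizes


if __name__ == "__main__":
    n = 100_000
    print(f"{n} appends triggered {count_resizes(n)} size changes")
```

The number of size changes should be far smaller than the number of appends, which is what makes append amortized constant time while still leaving some cost for the occasional reallocation.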

Generator expression (573 msec per loop)

def foo():
    a = list((i for i in range(10000000)))

Generator function with yield (580 msec per loop)

def foo():
    def bar():
        for i in range(10000000):
            yield i

    a = list(bar())

These time measurements carry some uncertainty, and the two generator variants took roughly the same amount of time.
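One common way to reduce this kind of noise is to repeat each measurement several times and keep the minimum, as the timeit documentation suggests. This is a sketch only: the helper best_of, the function names and the reduced size N are assumptions, not part of the original tests.

```python
import timeit

N = 100_000  # assumption: reduced from 10,000,000 so the sketch runs quickly


def gen_expression():
    # Generator expression consumed by list().
    return list(i for i in range(N))


def gen_function():
    # Generator function using yield, consumed by list().
    def bar():
        for i in range(N):
            yield i

    return list(bar())


def best_of(func, repeat=5, number=5):
    # Hypothetical helper: best per-call time in seconds over `repeat` runs.
    times = timeit.repeat(func, repeat=repeat, number=number)
    return min(times) / number


if __name__ == "__main__":
    for func in (gen_expression, gen_function):
        print(f"{func.__name__}: {best_of(func) * 1000:.3f} msec per call")
```

Taking the minimum rather than the mean filters out runs that were slowed down by other activity on the machine.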

List append (827 msec per loop)

def foo():
    a = []
    for i in range(10000000):
        a.append(i)

edited Jan 15, 2018 at 13:49

answered Jan 15, 2018 at 13:21


8 Comments

AndreyF Over a year ago

How about just using range(10000000)?

questiondude Over a year ago

In an actual program you would of course do something more meaningful with the elements of the list. This just tests the overhead of different methods of creating and filling the list. (And if we are nitpicking, you would have to do list(range(10000000)) to actually create a list in your counter-example :-)

PM 2Ring Over a year ago

What speed do you get for list(range(10000000))? Another option: a = [0] * k; a[:] = range(k)

PM 2Ring Over a year ago

@AndreyF Sure, that's fine, if you just need an iterator, and not an actual list. Unless you're talking about Python 2 range, in which case you need to migrate to Python 3. :)

Chris_Rands Over a year ago

Yes, you should include list(range()) in your times, this is actually the idiomatic way of doing this!
