1

I'm studying algorithms and data structures at the moment.

I thought I would run a quick timeit.timeit test for iterating through a list of 2**30 random integers in a list() comparing to the same for the array.array format.

I was expecting the array to finish first as one of the few muted benefits I have seen on other posts with a Python array is performance (I was initially wrongly under the impression that the list was implemented as a linked list: thank you for the correction Duncan)

Surely an array should be at least as quick as a list?

import os
import array
l = list(os.urandom(2**30))
a = array.array('I', l)

def test_list():
 for i in l:
  pass

def test_array():
 for i in a:
  pass

>>> timeit.timeit(test_array, number=5)
50.08525877200009
>>> timeit.timeit(test_list, number=5)
37.00491460799958

Here's my platform information: Python 3.6.5, [GCC 7.3.0] on linux x86_64 (Intel i5 4660)

1
  • 5
    Are we supposed to make wild guesses about what test_array and test_list do, or would you like to tell us? And no, Python lists are not linked lists. I don't know where you got that from but they are stored internally as an array. Commented May 3, 2018 at 10:52

1 Answer 1

4

First you initialise l to a list of 2**30 Python int values.

Second you initialise a from the list to create a list of 2**30 C integers.

test_list iterates over the list of Python int values. No Python objects are created or destroyed in this process, just a reference counter on each one gets incremented and then decremented.

test_array iterates over the list of C integers creating a new Python int for each element and then destroying it again. That's why the array is slower: it creates and destroys 2**30 Python objects.

Internally a Python list is just an array of pointers to the objects it contains. That means iterating over the list is as simple and as fast as iterating over an array. The array type here will be using less memory overall (or it would be if you hadn't held on to the list) as C integers are much smaller than Python objects, but each access into the array has to convert the C value into a Python object and while object creation is heavily optimised it still takes more time than just getting another reference to an existing object.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.