9

I'm learning python recently, and is doing many practice with the language.

One thing I found interesting is that, when I read from an array, it's almost half of the time slower than list. Does somebody know why?

here's my code:

from timeit import Timer
import array

t = 10000
l = range(t)
a = array.array('i', l)
def LIST():
    for i in xrange(t):
        l[i]

def ARRAY():
    for i in xrange(t):
        a[i]

print Timer(LIST).timeit(1000);
print Timer(ARRAY).timeit(1000);

the output is:

0.813191890717
1.16269612312

which indicates that reading array is slower than list. I think array is a fixed size memory, while list is a dynamic structure. So I assumed that array would be faster than list.

Does anyone has any explanation?

4
  • 1
    possible dupe/answer: stackoverflow.com/questions/176011/… - Basically array.array is a wrapper around a C array so I think there is overhead when accessing it. Don't use it for math. Commented Jul 17, 2010 at 14:16
  • Trying to second-guess Python efficiencies - especially for those coming from a C-like background - is often counter-intuitive. Code clearly first, then optimize if you measure a performance problem; this applies to C also, but because the language elements are so close to the machine people often forget. Commented Jul 17, 2010 at 14:31
  • 1
    For math you may want to use numpy (not available for python 3 yet), only god knows why numpy is not standard library. Commented Jul 17, 2010 at 14:32
  • 1
    numpy is very close to available on Python 3: mail-archive.com/[email protected]/msg26524.html Commented Jul 17, 2010 at 14:40

3 Answers 3

11

lists are "dynamically growing vectors" (very much like C++'s std::vector, say) but that in no way slows down random access to them (they're not linked lists!-). Lists' entries are references to Python objects (the items): accessing one just requires (in CPython) an increment of the item's reference count (in other implementations, based on more advanced garbage collection, not even that;-). Array's entries are raw bits and bytes: accessing one requires a fresh new Python object to be synthesized based on that binary value. So e.g.:

$ python -mtimeit -s'import array; c=array.array("B", "bzap")' 'c[2]'
10000000 loops, best of 3: 0.0903 usec per loop
$ python -mtimeit -s'c=list("bzap")' 'c[2]'
10000000 loops, best of 3: 0.0601 usec per loop

30 nanoseconds' extra time for an access doesn't seem too bad;-).

As an aside, note that timeit is MUCH nicer to use from the command line -- automatic choice of repetition, unit of measure shown for the time, etc. That's how I always use it (importing a custom-coded module with functions to be called, if needed -- but here there is no need for that) -- it's so much handier than importing and using it from a module!

Sign up to request clarification or add additional context in comments.

Comments

7

It takes time to wrap a raw integer into a Python int.

1 Comment

Quite so. The function LIST just has to increment and then decrement the reference count for each list element. The ARRAY function on the other hand has to allocate memory for each of the integers (except the smaller ones which have an optimisation) and then free it again.
1

The Python lists really resemble some ways normal arrays, they are not Lisp lists, but they have fast random access.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.