Indexing numpy array as slow as python append

Question

I have always been told that python's native append is a slow function and should be avoided in for loops. However after a couple of small tests I have found it performs more poorly than a numpy array when iterating over it with a for loop:

First Test -array/list construction

python native list append

def pythonAppend(n):
      x = []
      for i in range(n):
          x.append(i)
      return x

%timeit pythonAppend(1000000)

Numpy allocate array then access

 def  numpyConstruct(n):
     x = np.zeros(n)
     for i in range(n):
         x[i] = i
     return x

%timeit numpyConstruct(1000000)

Results:

Python Time: 179 ms

Numpy Time: 189 ms

Second Test -Accesing Elements

n = 1000000
x = pythonAppend(n)
arr = numpyConstruct(n)
order = arr[:]; np.random.shuffle(order); order = list(order.astype(int))

def listAccess(x):
     for i in range(len(x)):
             x[i] = i/3.
     return x

def listAccessOrder(x, order):
     for i in order:
             x[i] = i/3.
     return x

%timeit listAccess(x)
%timeit listAccess(arr)
%timeit listAccessOrder(x, order)
%timeit listAccessOrder(arr, order)
%timeit arr / 3.

Results

python -sequential: 175 ms

numpy -sequential: 198 ms

python -shuffled access: 2.08 s

numpy -shuffled access: 2.15 s

numpy -vectorised access: 1.92ms

These results to me are quite surprising as I would have thought that at least due to python being linked lists accessing elements would be slower than numpy due to having to follow a chain of pointers. Or am I misunderstanding the python list implementation? Also why do the python lists perform slightly better than the numpy equivalents -I guess most of the inefficiency comes from using a python for loop but python's append still out-competes numpy accessing and allocating.

I guess when you loose numpy in a loop you loss all the performance advantages. But if you can write it as an vector function its going to be faster then looping with clean python. — some_name.py
– some_name.py, Commented Nov 28, 2019 at 12:32
I thought numpy arrays were sequential? So accessing their elements (especially in a random order) would be faster as the CPU can grab the chunk of memory straight away rather than having to follow a sequence of pointers to the element? — Matt Ellis
– Matt Ellis, Commented Nov 28, 2019 at 12:44

hpaulj · Accepted Answer · 2019-11-28 17:20:56Z

Python lists are not linked lists. The data buffer of the list object contains links/pointers to objects (else where in memory). So fetching the ith element is easy. The buffer also has growth headroom, so append is a simple matter of inserting the new element's link. The buffer has to be reallocated periodically, but Python manages that seamlessly.

A numpy array also has a 1d data buffer, but it contains numeric values (more generally the bytes as required by the dtype). Fetching is easy, but it has to create a new python object to 'contain' the value ('unboxing'). Assignment also requires conversion from the python object to bytes that will be stored.

Generally we've found that making a new array by appending to a list (with one np.array call at the end) is competitive with assignment to a preallocated array.

Iteration through a numpy array is normally slower than iterating through a list.

What we strongly discourage is using np.append (or some variant) to grow an array iteratively. That makes a new array each time, with full copy.

numpy arrays are fast when the iteration is done in the compiled code. Normally that's with the whole-array methods. The c code can iterate over the data buffer, operating on the values directly, rather than invoking python object methods at each step.

Collectives™ on Stack Overflow

Indexing numpy array as slow as python append

First Test -array/list construction

python native list append

Numpy allocate array then access

Results:

Second Test -Accesing Elements

Results

1 Answer 1

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

First Test -array/list construction

python native list append

Numpy allocate array then access

Results:

Second Test -Accesing Elements

Results

1 Answer 1

Comments

Your Answer

Sign up or log in

Post as a guest