
I am trying to create a 2D list in Python. I found two possibilities.

import numpy

def cArray(size):
    c = [[0. for i in range(size)] for j in range(size)]
    return c

def npArray(size):
    a = numpy.zeros((size, size))
    return a

Now both of these functions give the correct answer. The problem here is with the performance. I ran both of these using timeit, and here are my results:

list size is 5000
number of times run is 5

cArray average time: 3.73241295815
npArray average time: 0.210782241821

So obviously, I would like to avoid the first solution, especially since this will be running for sizes up to 100k. However, I also do not want to use too many dependencies. Is there a way for me to efficiently create a 2D array, without numpy? It doesn't need to be exactly up to speed, as long as it's not 17 times slower.

  • If this shocks you, try measuring memory consumption too. It may not be 17x larger but it definitely will be awful. Commented Aug 22, 2012 at 18:47
  • ctypes could help, but surely not as much as numpy; see stackoverflow.com/questions/4145775/… Commented Aug 22, 2012 at 18:50
  • .. sizes up to 100k? Then I hope your array either (1) isn't square despite your example being square, or (2) is very sparse, because that's a lot of cells to keep track of otherwise. Commented Aug 22, 2012 at 18:59
  • I really think that this question cannot be truly answered without knowing what is done to the data. There is also import array, which is limited to 1D, but a list of arrays is possible. To throw it in: [[0]*size for _ in xrange(size)], or array('d', [0]) instead of [0] (which is type-aware and thus takes much less space), eliminates one loop. Commented Aug 22, 2012 at 19:39
  • @Sebastian: this is for an implementation of a sequence alignment algorithm Commented Aug 22, 2012 at 20:16

4 Answers


So obviously, I would like to avoid the first solution, especially since this will be running for sizes up to 100k. However, I also do not want to use too many dependencies.

You must choose which of these is more important to you. Numpy has better performance precisely because it doesn't use the builtin Python types and uses its own types that are optimized for numerical work. If your data are going to be numeric and you're going to have 100k rows/columns, you will see a gigantic performance increase with numpy. If you want to avoid the numpy dependency, you will have to live with reduced performance. (Obviously you can always write your own Python libraries or C extensions to optimize for your particular use case, but these will then be dependencies like any other.)

Personally I would recommend you just use numpy. It is so widely used that anyone who is considering using a library that deals with 100k multidimensional arrays probably already has numpy installed.
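For what it's worth, the gap narrows if the pure-Python version avoids the inner comprehension and multiplies a one-row list instead. A quick timing sketch (Python 3 syntax; the `t_comp`/`t_mult` names and the smaller size are illustrative, not from the question):

```python
import timeit

size = 1000  # smaller than the question's 5000 so this runs quickly

# Nested comprehension, as in the question's cArray.
t_comp = timeit.timeit(
    "[[0. for i in range(size)] for j in range(size)]",
    globals={"size": size}, number=5)

# Multiplying a one-element row avoids the inner loop entirely; each row
# is still a fresh list, so there is no shared-reference problem.
t_mult = timeit.timeit(
    "[[0.] * size for j in range(size)]",
    globals={"size": size}, number=5)

print("comprehension: %.4fs  row-multiply: %.4fs" % (t_comp, t_mult))
```

It still won't match numpy, which skips Python objects entirely, but it removes most of the comprehension overhead.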



I tried a couple of alternatives:

Edit: The original eArray was faulty, it created references to the same list...

Edit2: Added array.array as suggested by Sebastian.
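(To illustrate the bug the first edit refers to: multiplying the *outer* list repeats references to one inner list, so a write to one cell shows up in every row. A minimal demonstration:)

```python
size = 3

# Broken: the outer * copies references to the same inner row list.
bad = [[0.] * size] * size
bad[0][0] = 1.
print(bad[1][0])  # 1.0 -- every "row" is the same object

# Correct: the comprehension builds a fresh row on each iteration.
good = [[0.] * size for _ in range(size)]
good[0][0] = 1.
print(good[1][0])  # 0.0 -- rows are independent
```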

import time
import numpy as np
import array

t1 = 0

def starttimer():
    global t1
    t1 = time.clock()

def stoptimer(s):
    t2 = time.clock()
    print 'elapsed time for "{}": {:.3f} seconds'.format(s, t2-t1)

def cArray(size):
    c = [[0. for i in range(size)] for j in range(size)]
    return c

def dArray(size):
    d = [[0. for i in xrange(size)] for j in xrange(size)]
    return d

def eArray2(size):
    return [[0.]*size for j in xrange(size)]

def fArray(size):
    return np.zeros((size,size))

def gArray(size):
    return [array.array('d', [0])*size for j in xrange(size)]

sz = 5000

starttimer()
cArray(sz)
stoptimer('cArray')

starttimer()
dArray(sz)
stoptimer('dArray')

starttimer()
eArray2(sz)
stoptimer('eArray2')

starttimer()
fArray(sz)
stoptimer('fArray')

starttimer()
gArray(sz)
stoptimer('gArray')

The results (cpython 2.7.3 on FreeBSD amd64, if anyone cares):

> python tmp/test.py
elapsed time for "cArray": 2.312 seconds
elapsed time for "dArray": 1.945 seconds
elapsed time for "eArray2": 0.680 seconds
elapsed time for "fArray": 0.180 seconds
elapsed time for "gArray": 0.695 seconds
> python tmp/test.py
elapsed time for "cArray": 2.312 seconds
elapsed time for "dArray": 1.914 seconds
elapsed time for "eArray2": 0.680 seconds
elapsed time for "fArray": 0.180 seconds
elapsed time for "gArray": 0.695 seconds
> python tmp/test.py
elapsed time for "cArray": 2.328 seconds
elapsed time for "dArray": 1.906 seconds
elapsed time for "eArray2": 0.680 seconds
elapsed time for "fArray": 0.180 seconds
elapsed time for "gArray": 0.703 seconds

4 Comments

This timing code won't work: t1 = time.clock() sets a local variable. That's why d and e have the same time when e should be much faster. If you fix this (e.g. global t1 in starttimer) you get more plausible results. It's usually easier to use the timeit module. As an aside, eArray isn't really a fair comparison because it doesn't actually make 5000^2 cells.
Also keep in mind that they don't do the same thing. For example, try changing an element of your eArray and look at the result.
@JoeKington I've changed eArray. I think I realized where it went wrong. :-/
Since you are at it, maybe add import array and use array.array('d', [0])*size instead of [0]*size for eArray.

If you want to go really low level, you can use ctypes:

import ctypes

a = (ctypes.c_int * 5000 * 5000)()

But NumPy is available on the majority of platforms where Python runs.
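If anyone takes this route: the resulting object is a zero-initialized, contiguous block of C ints that supports plain `[row][col]` indexing. A small sketch (using a smaller size than above, purely for illustration):

```python
import ctypes

size = 100  # smaller than 5000, just for the demonstration
Grid = ctypes.c_int * size * size  # an array-of-arrays type
a = Grid()                         # zero-initialized C int storage

a[2][3] = 42
print(a[2][3], a[0][0])  # -> 42 0
```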


My use case is an online judge platform where library dependencies are limited, but a 2D array is needed for dynamic programming (which is also hard to vectorize). My Python submissions there often get Time Limit Exceeded.

Things to keep in mind:

  1. Although a Python list is an array of pointers, operations on native objects are very fast.

  2. A compact structure like a numpy array may be quick to create, but accessing individual elements carries extra overhead compared to native Python objects,

  3. unless a JIT such as Numba is used.

Test code with a toy DP example:

import timeit

size = 1000

range_init_0 = "size={}".format(size)
range_init_1 = "a = [[0 for i in range(size)] for j in range(size)]"

multi_init_0 = "size={}".format(size)
multi_init_1 = "a = [[0]*size for _ in range(size)]"

numpy_init_0 = "from numpy import zeros; size={}".format(size)
numpy_init_1 = "a = zeros((size, size), dtype=int)"

array_init_0 = "from array import array; size={}".format(size)
array_init_1 = "a = [array('d', [0])*size for j in range(size)]"

ctypes_init_0 = "from ctypes import c_int; size={}".format(size)
ctypes_init_1 = "a = (c_int * size * size)()"

dp = '''
MOD = int(1e9+7)
for i in range(size):
    a[i][0] = 1
for j in range(size):
    a[0][j] = 1
for i in range(1, size):
    for j in range(1, size):
        a[i][j] = (a[i][j] + a[i-1][j] + a[i][j-1]) % MOD
'''

def test(name, init_0, init_1, dp, n=10):
    t = timeit.timeit(init_1, setup=init_0, number=n)
    print("{} initial time:\t{:.8f}".format(name, t))
    t = timeit.timeit(dp, setup=init_0 + '\n' + init_1, number=n)
    print("{} calculate time:\t{:.8f}".format(name, t))

test("range", range_init_0, range_init_1, dp)
test("multi", multi_init_0, multi_init_1, dp)
test("numpy", numpy_init_0, numpy_init_1, dp)
test("array", array_init_0, array_init_1, dp)
test("ctypes", ctypes_init_0, ctypes_init_1, dp)

print('------')
numba_init_0 = '''
import numpy as np
size = {}
a = np.zeros((size, size), dtype=np.int32)
'''.format(size)
numba_init_1 = '''
import numba
def dp1(a):
    size = len(a)
    MOD = int(1e9+7)
    for i in range(size):
        a[i][0] = 1
    for j in range(size):
        a[0][j] = 1
    for i in range(1, size):
        for j in range(1, size):
            a[i][j] = (a[i][j] + a[i-1][j] + a[i][j-1]) % MOD
dp_jit = numba.jit('void(i4[:,:])')(dp1)
'''
dp = "dp_jit(a)"

test("numba", numba_init_0, numba_init_1, dp)

Results:

range initial time:     0.56781153
range calculate time:   5.08359793
multi initial time:     0.03682878
multi calculate time:   5.14657282
numpy initial time:     0.00883761
numpy calculate time:   12.15619322
array initial time:     0.02656035
array calculate time:   5.27542352
ctypes initial time:    0.00523795
ctypes calculate time:  7.88469346
------
numba initial time:     2.98394509
numba calculate time:   0.05321887

(Numba initialization time here doesn't include numpy initialization)

As we can see, both numpy and ctypes are slower than native lists for this kind of element-by-element computation.

Numba's JIT compilation costs some time up front, but the computation itself is dramatically faster.

(And don't use pure Python for 2D dynamic programming on online judge platforms!)
