for huge arrays is numpy slower than list? [duplicate]

Question

check my following code; it is part of sigma_2 function (using crude sieving) implemented in python which is one of divisor functions http://mathworld.wolfram.com/DivisorFunction.html

from time import time
from itertools import count
import numpy

def sig2(N, nump=False):
    init = time()


    #initialize array with value=1 since every positive integer is divisible by 1
    if nump:
        print 'using numpy'
        nums = numpy.ones((N,), dtype=numpy.int64)
    else:        
        nums = [1 for i in xrange(1, N)]

    #for each number n < N, add n*n to n's multiples
    for n in xrange(2, N):
        nn = n*n
        for i in count(1):
            if n*i >= N: break
            nums[n*i-1] += nn

    print 'sig2(n) done - {} ms'.format((time()-init)*1000)

I tried it with varying values and with numpy it is quite disappointing.

for 2000:

sig2(n) done - 4.85897064209 ms
took : 33.7610244751 ms
using numpy
sig2(n) done - 31.5930843353 ms
took : 55.6900501251 ms

for 200000:

sig2(n) done - 1113.80600929 ms
took : 1272.8869915 ms
using numpy
sig2(n) done - 4469.48194504 ms
took : 4705.97100258 ms

it goes on and my code isn't really scalable - for it not being O(n), but with these two and besides these two result using numpy causes performance problems. Shouldn't numpy be faster than python lists and dicts? That was my impression on numpy.

I don't see how your function implements divisor functions. Which is the return value? — btel
– btel, Commented Nov 14, 2012 at 10:37
It returns list of numbers whose sigma_2 matches given criteria. Only a fraction of the code is here ... — thkang
– thkang, Commented Nov 14, 2012 at 11:24

Fred Foo · Accepted Answer · 2012-11-14 11:31:55Z

6

As @unutbu said numpy really shines when you use vectorised operations. Here is an optimised implementation using numpy (it is consistent with the definition of divisor function from Mathworld):

import numpy as np

def sig2_numpy(N):

    x = np.arange(1,N+1)
    x[(N % x) != 0] = 0
    return np.sum(x**2)

When you call it, it is much faster:

>> import time
>> init = time.time()
>> print sig2_numpy(20000)
>> print "It took", (time.time()-init)*1000., 'ms'
It took 0.916957855225 ms

edited Nov 14, 2012 at 11:31

Fred Foo

365k80 gold badges765 silver badges852 bronze badges

answered Nov 14, 2012 at 10:53

btel

5,6936 gold badges40 silver badges48 bronze badges

Sign up to request clarification or add additional context in comments.

3 Comments

Fred Foo Over a year ago

Please don't use the <> operator, it's been "considered obsolescent" at least since Python 2.0 (2000). I've replaced it with !=.

Fred Foo Over a year ago

Oh btw., np.sum(x**2) == np.dot(x,x). The latter might be faster because it doesn't produce the intermediate x**2 array.

btel Over a year ago

Yes, you are right %timeit np.sum(x**2) -> 8.88 ms per loop vs %timeit np.dot(x,x) -> 1.78 ms per loop (for 1e6 elements). I will leave the answer as it is, because it might be easier to understand what is going on.

unutbu · Accepted Answer · 2012-11-14 11:23:12Z

3

NumPy achieves speed by performing calculations on whole arrays, rather than on single values at a time.

When you write

    for i in count(1):
        if n*i >= N: break
        nums[n*i-1] += nn

you are forcing the NumPy array nums to increment single values in an array one index at a time. That is a slow operation for NumPy arrays.

edited Nov 14, 2012 at 11:23

answered Nov 14, 2012 at 10:36

unutbu

886k197 gold badges1.9k silver badges1.7k bronze badges

3 Comments

thkang Over a year ago

for that part, if I use numpy to generate another array whose value is 0 if index isn't divisible by n and n*n for n's multiples and perform array addition operation, will it speedup my code?

Phil H Over a year ago

@thkang: correct answer: probably. Better answer: try it.

unutbu Over a year ago

It seems @btel has implemented what you've described.

Collectives™ on Stack Overflow

for huge arrays is numpy slower than list? [duplicate]

2 Answers 2

3 Comments

3 Comments

Linked

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

3 Comments

3 Comments

Linked

Related