
For a 1-D numpy array a, I thought that np.sum(a) and a.sum() are equivalent functions, but I just did a simple experiment, and it seems that the latter is always a bit faster:

In [1]: import numpy as np

In [2]: a = np.arange(10000)

In [3]: %timeit np.sum(a)
The slowest run took 16.85 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 6.46 µs per loop

In [4]: %timeit a.sum()
The slowest run took 19.80 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 5.25 µs per loop

Why is there a difference? Does this mean that we should always use the numpy.ndarray version of functions like sum, mean, std, etc.?

2 Comments

  • Mostly you are seeing a difference in one level of function redirection. In most of these cases the function version redirects the task to the method (look at the code). Don't worry about speed here - use the form that makes your code clearest (to you and your readers). You must use the function version if your input might be a list instead of an array. Commented Feb 23, 2018 at 6:48
  • Last year I answered something similar np.sum and np.add.reduce - in production, what do you use? Commented Feb 23, 2018 at 6:59
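As the first comment notes, the function form is the only option when the input might be a plain list rather than an array — a quick check:

```python
import numpy as np

# The function form accepts any array-like input...
print(np.sum([1, 2, 3]))        # 6
print(np.sum((1.0, 2.0, 3.0)))  # 6.0

# ...while the method form only exists on ndarrays:
try:
    [1, 2, 3].sum()
except AttributeError as e:
    print(e)  # 'list' object has no attribute 'sum'
```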

1 Answer


I'd imagine it is because np.sum() and the like need to explicitly convert their input to an ndarray first (using np.asanyarray), and check for a few other .sum implementations before settling on the ndarray.sum method, in order to allow operation on lists, tuples, etc.

On the other hand, ndarray.sum() is a method of the ndarray class and thus doesn't need to do any checking.
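The dispatch order can be sketched roughly like this (a hypothetical simplification for illustration, not NumPy's actual source — the real np.sum handles more cases):

```python
import numpy as np

def sum_sketch(a, *args, **kwargs):
    """Rough sketch of np.sum's duck-typed dispatch (illustrative only)."""
    try:
        # Prefer an existing .sum method if the object provides one...
        method = a.sum
    except AttributeError:
        # ...otherwise convert to an array first, then sum.
        return np.asanyarray(a).sum(*args, **kwargs)
    return method(*args, **kwargs)

print(sum_sketch(np.arange(10000)))  # 49995000
print(sum_sketch([1, 2, 3]))         # 6
```

The extra attribute lookup and fallback logic is the per-call overhead that a.sum() skips entirely.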


6 Comments

Thanks, but I thought the conversion should just boil down to a simple check if a is already an array, and shouldn't involve explicit copying, right?
You'd think so, but it actually seems to deduce that a is an ndarray only if nothing else's .sum method works, including generators and the old numeric multiarray
Which makes sense, you don't want to convert a big linked list to ndarray if you don't have to
@Hilbert: No copy is made - the overhead is O(1), not O(N).
@Eric My guess is that is to maintain the functionality of the standard sum in case you do from numpy import * or from pylab import *.
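The no-copy claim in the comments is easy to verify: np.asanyarray returns its argument unchanged when it is already an ndarray, so the conversion step is O(1).

```python
import numpy as np

a = np.arange(10000)
# asanyarray passes an existing ndarray through untouched - no copy is made.
print(np.asanyarray(a) is a)  # True
```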
