Python Numpy - numpy axis performance

Question

By default, NumPy arrays are row-major (C order), so the following results make sense to me.

import numpy as np
a = np.random.rand(5000, 5000)

%timeit a[0,:].sum()
3.57 µs ± 13.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit a[:,0].sum()
38.8 µs ± 8.19 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Because the layout is row-major, it is natural that a[0,:] is the faster of the two. However, when I use the NumPy sum function with an axis argument, the result is different.

%timeit a.sum(axis=0)
16.9 ms ± 13.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit a.sum(axis=1)
29.5 ms ± 90.3 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

With the NumPy sum function, it is faster to sum along axis=0 (down the columns).

So my question is: why is summing along axis=0 (down the columns) faster than summing along axis=1 (along the rows)?

For example

a = np.array([[0, 1, 2],
              [3, 4, 5],
              [6, 7, 8]], order='C')

In row-major order, the rows [0,1,2], [3,4,5], and [6,7,8] are each stored in adjacent memory.
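A minimal sketch of how this layout can be checked, using the 3x3 array from the question. It only inspects the array's flags and strides; it does not measure anything. Strides are divided by the item size so the result does not depend on the platform's default integer width.

```python
import numpy as np

a = np.array([[0, 1, 2],
              [3, 4, 5],
              [6, 7, 8]], order='C')

# C-contiguous means the last axis varies fastest in memory.
print(a.flags['C_CONTIGUOUS'])

# order='K' reads the elements in the order they sit in memory.
memory_order = a.ravel(order='K')
print(memory_order)

# Strides are reported in bytes; convert them to element counts.
row_step = a.strides[0] // a.itemsize   # elements skipped to reach the next row
col_step = a.strides[1] // a.itemsize   # elements skipped to reach the next column
print(row_step, col_step)
```

Moving one column to the right advances one element in memory, while moving one row down skips a whole row's worth of elements.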

Therefore, summing along axis=1 should be faster than summing along axis=0. However, with the NumPy sum function, it is faster to sum down the columns (axis=0).

How can you explain this?

Thanks

a[0,:] only sums the first row. a.sum(axis=0) sums the entire array along axis 0 (down the rows, giving one sum per column). You're not computing the same thing at all.
– cs95, Commented Jan 29, 2018 at 1:52
"However, if use the numpy built-in function, the result is different." Actually, you always use a "numpy built-in function"! @COLDSPEED is exactly correct: you are computing different things.
– AGN Gazer, Commented Jan 29, 2018 at 9:08
a.sum(axis=0) gives the result computed per column. The result of "print(a[0,:].sum(), a[1,:].sum(), a[2,:].sum())", computed along the rows, is [3 12 21], whereas the result of a.sum(axis=0) is [9, 12, 15].
– 신대규, Commented Jan 29, 2018 at 11:34
@신대규 Yeah, if you want the same output, use axis=1, not 0. Also, the reason one is slower than the other is cache misses and a lack of locality of reference.
– cs95, Commented Jan 29, 2018 at 12:34
@cᴏʟᴅsᴘᴇᴇᴅ Thank you for your reply. This is very confusing to me. I have also tested it on a Linux system:
a = np.random.rand(5000, 5000)
%timeit a.sum(axis=0)
100 loops, best of 3: 17 ms per loop
%timeit a.sum(axis=1)
100 loops, best of 3: 15.5 ms per loop
This is not a big difference, but it makes sense. On a Windows system, however:
%timeit a.sum(axis=0)
17 ms ± 122 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit a.sum(axis=1)
29.7 ms ± 153 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
It seems to be affected by the operating system or the NumPy version.
– 신대규, Commented Jan 29, 2018 at 13:00

cs95 · Accepted Answer · 2018-01-29 01:55:32Z


You are not computing the same thing.

The first two commands only compute one row/column out of the entire array.

a[0, :].sum().shape   # sums only the first row; the result is a scalar
()

The second two commands sum the entire contents of the 2D array, but along a given axis. So you don't get a single result (as in the first two commands), but a 1D array of sums.

a.sum(axis=0).shape   # sums down the rows, giving one sum per column
(5000,)

In summary, the two sets of commands do different things.

a
array([[1, 6, 9, 1, 6],
       [5, 6, 9, 1, 3],
       [5, 0, 3, 5, 7],
       [2, 8, 3, 8, 6],
       [3, 4, 8, 5, 0]])

a[0, :]
array([1, 6, 9, 1, 6])

a[0, :].sum()
23

a.sum(axis=0)
array([16, 24, 32, 20, 22])
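To make the correspondence explicit, here is a small sketch on the same 5x5 example array: summing a single row or column with slicing matches one entry of the corresponding full axis sum. The names row_sums and col_sums are just illustrative.

```python
import numpy as np

a = np.array([[1, 6, 9, 1, 6],
              [5, 6, 9, 1, 3],
              [5, 0, 3, 5, 7],
              [2, 8, 3, 8, 6],
              [3, 4, 8, 5, 0]])

row_sums = a.sum(axis=1)   # collapses axis 1: one sum per row
col_sums = a.sum(axis=0)   # collapses axis 0: one sum per column

# A sliced sum is a single entry of the matching axis sum.
print(a[0, :].sum() == row_sums[0])   # first row
print(a[:, 0].sum() == col_sums[0])   # first column

print(row_sums.shape, col_sums.shape)
```

So a fair timing comparison pairs a[0, :].sum() with a[:, 0].sum(), and a.sum(axis=1) with a.sum(axis=0), rather than mixing the two groups.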



1 Comment

Brad Solomon Over a year ago

Also worth noting: if you do c = np.random.randn(5000, 5000) and f = np.asfortranarray(c), you do see non-negligible, inverted timing differences for sums over axes 0 and 1.
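A rough sketch of how that comparison could be run outside IPython, using the standard timeit module. The helper time_axis_sums is hypothetical, and the array is smaller than the 5000x5000 one above to keep memory use modest. Actual timings depend on the machine, operating system, and NumPy build, so none are shown here.

```python
import timeit
import numpy as np

n = 2000  # assumption: smaller than 5000 so two copies fit comfortably in memory
c = np.random.randn(n, n)    # C order (row-major), NumPy's default
f = np.asfortranarray(c)     # same values, stored column-major

def time_axis_sums(arr, repeats=5):
    """Hypothetical helper: mean seconds per call for arr.sum over axis 0 and 1."""
    return tuple(
        timeit.timeit(lambda ax=ax: arr.sum(axis=ax), number=repeats) / repeats
        for ax in (0, 1)
    )

for name, arr in (("C order", c), ("F order", f)):
    t0, t1 = time_axis_sums(arr)
    print(f"{name}: axis=0 {t0 * 1e3:.2f} ms, axis=1 {t1 * 1e3:.2f} ms")
```

Both arrays hold the same values, so any difference between the two printed lines comes from memory layout rather than from the data.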
