23

In NumPy, is there a fast way to calculate the mean across multiple axes? I am calculating the mean over every axis except axis 0 of an n-dimensional array.

I am currently doing this:

for i in range(d.ndim - 1):
    d = d.mean(axis=1)

I'm wondering if there is a solution that doesn't use a Python loop.


4 Answers

43

Since NumPy 1.7 you can pass a tuple of axes to np.mean:

d.mean(axis=tuple(range(1, d.ndim)))

I am guessing this will perform similarly to the other proposed solutions, unless reshaping the array to flatten the higher dimensions triggers a copy of the data, in which case this should be much faster. So this approach will probably give more consistent performance.
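As a quick sanity check, here is a minimal sketch comparing the loop from the question with the tuple-of-axes call. The array shape and the random data are made up for illustration; only `d.mean(axis=tuple(range(1, d.ndim)))` comes from this answer.

```python
import numpy as np

# Hypothetical sample data: 4 items, each a 3x5x2 block.
rng = np.random.default_rng(0)
d = rng.random((4, 3, 5, 2))

# Approach from the question: collapse axis 1 repeatedly until only axis 0 is left.
looped = d
for _ in range(d.ndim - 1):
    looped = looped.mean(axis=1)

# Tuple-of-axes approach: reduce over axes 1..ndim-1 in a single call.
direct = d.mean(axis=tuple(range(1, d.ndim)))

print(direct.shape)                 # (4,)
print(np.allclose(looped, direct))  # True
```

The two agree here because every slice being averaged has the same number of elements, so a mean of means equals the overall mean.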


4 Comments

Excellent! I'm on numpy 1.6 currently though.
I cannot confirm that this feature exists in either version 1.7.1 or version 1.8.2 of numpy. Wishful thinking?
Have you tried it and it didn't work? It was there in 1.7, trust me. The change was made to all ufuncs, so it propagated automatically to all functions, like np.mean, that rely on np.add. It wasn't added to the docs of mean and other similar functions, although there is an open PR to fix that. But you can take a look at the docs of np.sum which were updated and have the relevant change tagged as New in version 1.7.0.
@Jamie D'oh! I had tried it, but with a list of integers, rather than a tuple, inspired by your example that uses range (which in Python 2.7 returns a list). In my opinion, the numpy functions should be indifferent to the type, supporting any iterable (list, tuple, range...). Until then, you should change your example so it actually works. =)
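The list-versus-tuple pitfall described in the comment above can be checked with a small sketch like the following. The array is made up, and whether a list is rejected may depend on the NumPy version, so the code records what happens instead of assuming it.

```python
import numpy as np

x = np.arange(8, dtype=float).reshape(2, 2, 2)
axes = list(range(1, x.ndim))  # a list, like Python 2's range() would return

# Record whether this NumPy version accepts a list for `axis`.
try:
    np.mean(x, axis=axes)
    list_accepted = True
except TypeError:
    list_accepted = False

# Converting to a tuple is the form the answer relies on.
result = np.mean(x, axis=tuple(axes))

print(list_accepted)
print(result)  # [1.5 5.5]
```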
10

My approach would be to reshape the array to flatten all of the higher dimensions and then take the mean along axis 1. Is this what you're looking for?

In [14]: x = np.array([[[1,2],[3,4]],[[5,6],[7,8]]])

In [16]: x.reshape((x.shape[0], -1)).mean(axis=1)
Out[16]: array([ 2.5,  6.5])

(the -1 tells reshape to infer the product of the lengths of the higher dimensions)
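Whether this reshape is free depends on the memory layout of the input. The sketch below uses a made-up array and `np.shares_memory` to show one case where the reshape is a view and one where it has to copy. It is an illustration of the layout issue, not a benchmark.

```python
import numpy as np

x = np.arange(24, dtype=float).reshape(2, 3, 4)

# C-contiguous input: flattening the trailing axes is just a view.
flat = x.reshape((x.shape[0], -1))
print(np.shares_memory(x, flat))     # True

# Swapping the last two axes gives a non-contiguous view of the same data;
# flattening its trailing axes forces NumPy to copy.
xt = x.transpose(0, 2, 1)
flat_t = xt.reshape((xt.shape[0], -1))
print(np.shares_memory(xt, flat_t))  # False

# The result is the same either way.
print(np.allclose(flat_t.mean(axis=1), xt.mean(axis=(1, 2))))  # True
```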

3 Comments

Thanks! Also, -1 can be used instead of higher_dims.
@dsg101 awesome! Editing now!
For numpy 1.7 and newer, see stackoverflow.com/a/17403375/2910092
3

You can also use numpy.apply_over_axes:

import numpy as np

x = np.array([[[1,2],[3,4]],[[5,6],[7,8]]])
y = np.apply_over_axes(np.mean, x, (1, 2))
# y is array([[[2.5]], [[6.5]]]), with shape (2, 1, 1)
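Note that `np.apply_over_axes` keeps the reduced axes as length-1 dimensions. A minimal sketch of how that relates to `keepdims=True`, and how to get back a 1-D result, using the same example array:

```python
import numpy as np

x = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

# apply_over_axes keeps the reduced axes with length 1.
kept = np.apply_over_axes(np.mean, x, (1, 2))
print(kept.shape)                 # (2, 1, 1)

# This matches a tuple-of-axes mean with keepdims=True.
same = x.mean(axis=(1, 2), keepdims=True)
print(np.array_equal(kept, same)) # True

# Drop the length-1 axes to get one mean per entry along axis 0.
flat = kept.squeeze(axis=(1, 2))
print(flat)                       # [2.5 6.5]
```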


0

Following on from the suggestion of @dsg101, is this the sort of thing you want?

>>> import numpy as np
>>> d=np.reshape(np.arange(5*4*3),[5,4,3])
>>> d
array([[[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8],
        [ 9, 10, 11]],

       [[12, 13, 14],
        [15, 16, 17],
        [18, 19, 20],
        [21, 22, 23]],

       [[24, 25, 26],
        [27, 28, 29],
        [30, 31, 32],
        [33, 34, 35]],

       [[36, 37, 38],
        [39, 40, 41],
        [42, 43, 44],
        [45, 46, 47]],

       [[48, 49, 50],
        [51, 52, 53],
        [54, 55, 56],
        [57, 58, 59]]])
>>> # np.prod is used here because np.product was removed in NumPy 2.0
>>> np.mean(np.reshape(d, [d.shape[0], np.prod(d.shape[1:])]), axis=1)
array([  5.5,  17.5,  29.5,  41.5,  53.5])

2 Comments

Yes, I think this would be faster than the OP soln.
np.product(d.shape[1:]) is better spelt -1
