Computing average for numpy array

Question

I have a 2D numpy array (6 x 6 elements). I want to create another 2D array from it, where each element is the average of all elements within a blocksize x blocksize window of the original. Currently, I have the following code:

import numpy

def avg_func(data, blocksize = 2):
    # Takes data, and averages all positive (only numerical) numbers in blocks
    dimensions = data.shape

    height = int(numpy.floor(dimensions[0]/blocksize))
    width = int(numpy.floor(dimensions[1]/blocksize))
    averaged = numpy.zeros((height, width))

    for i in range(0, height):
        print(i * 1.0 / height)  # progress: fraction of block rows processed
        for j in range(0, width):
            block = data[i*blocksize:(i+1)*blocksize,j*blocksize:(j+1)*blocksize]
            if block.any():
                averaged[i][j] = numpy.average(block[block>0])

    return averaged

arr = numpy.random.random((6,6))
avgd = avg_func(arr, 3)

Is there any way I can make it more pythonic? Perhaps numpy has something which does it already?

UPDATE

Based on M. Massias's solution below, here is an update with the fixed values replaced by variables. I am not sure it is coded right, but it does seem to work:

dimensions = data.shape 
height = int(numpy.floor(dimensions[0]/block_size)) 
width = int(numpy.floor(dimensions[1]/block_size)) 

t = data.reshape([height, block_size, width, block_size]) 
avrgd = numpy.mean(t, axis=(1, 3))

could you supply an input with expected output? your code is not very self-explanatory ^^
– daniel451, Commented Aug 31, 2015 at 20:05
modified code to be inclusive of sample input and expected output
– user308827, Commented Aug 31, 2015 at 20:12

P. Camilleri · Accepted Answer · 2015-09-01 05:44:40Z

2

To compute some operation slice by slice in numpy, it is very often useful to reshape your array and use extra axes.

To explain the process we'll use here: you can reshape your array so that each block of columns gets its own axis, take the mean along that axis, then reshape again to do the same for the rows and take the mean again. Here I assume blocksize is 2.

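import numpy as np

t = np.array([[0, 1, 2, 3, 4, 5]] * 6)  # 6 x 6 array, six identical rows
t = t.reshape([6, 3, 2])   # split each row into 3 blocks of 2 columns
t = np.mean(t, axis=2)     # average within each column block -> shape (6, 3)
t = t.reshape([3, 2, 3])   # group the rows into 3 blocks of 2 rows
np.mean(t, axis=1)         # average within each row block -> shape (3, 3)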

outputs

array([[ 0.5,  2.5,  4.5],
       [ 0.5,  2.5,  4.5],
       [ 0.5,  2.5,  4.5]])

Now that it's clear how this works, you can do it in one pass only:

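t = np.array([[0, 1, 2, 3, 4, 5]] * 6)  # start again from the original 6 x 6 array
t = t.reshape([3, 2, 3, 2])             # axes: (row block, row in block, column block, column in block)
np.mean(t, axis=(1, 3))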

This works too (and should be quicker, I guess, since the mean is computed only once). I'll let you substitute height/blocksize, width/blocksize and blocksize accordingly.
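As a minimal sketch of that substitution: the function name `block_mean` is hypothetical, and cropping leftover rows and columns when the shape is not a multiple of blocksize is an assumption (it matches the floor in the question's code, but the reshape itself would simply raise an error without it).

```python
import numpy as np

def block_mean(data, blocksize=2):
    """Average non-overlapping blocksize x blocksize blocks of a 2D array."""
    data = np.asarray(data)
    height = data.shape[0] // blocksize   # number of block rows
    width = data.shape[1] // blocksize    # number of block columns
    # Assumption: drop trailing rows/columns that do not fill a whole block.
    cropped = data[:height * blocksize, :width * blocksize]
    # Axes after reshape: (block row, row in block, block column, column in block)
    blocks = cropped.reshape(height, blocksize, width, blocksize)
    return blocks.mean(axis=(1, 3))

t = np.array([[0, 1, 2, 3, 4, 5]] * 6)
print(block_mean(t, 2))  # 3 x 3 result, every row is 0.5, 2.5, 4.5
print(block_mean(t, 3))  # 2 x 2 result, every row is 1.0, 4.0
```

Passing a tuple to `axis` requires NumPy 1.7 or later.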

See @askewchan's nice remark in the comments on how to generalize this to any number of dimensions.
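One difference worth noting: the reshape approach averages every element of a block, whereas the loop in the question averages only the positive ones (`block[block > 0]`). If that behaviour matters, a sketch along the following lines might do. The name `block_mean_positive` is hypothetical, and leaving blocks without any positive entry at 0 is an assumption that only roughly mirrors the zero-initialised output in the question.

```python
import numpy as np

def block_mean_positive(data, blocksize=2):
    """Block average that only counts strictly positive elements."""
    data = np.asarray(data, dtype=float)
    height = data.shape[0] // blocksize
    width = data.shape[1] // blocksize
    blocks = data[:height * blocksize, :width * blocksize].reshape(
        height, blocksize, width, blocksize)
    positive = blocks > 0
    # Sum of the positive entries and how many there are, per block
    totals = np.where(positive, blocks, 0.0).sum(axis=(1, 3))
    counts = positive.sum(axis=(1, 3))
    # Assumption: blocks with no positive entries are reported as 0
    return np.divide(totals, counts, out=np.zeros_like(totals), where=counts > 0)

sample = np.array([[1, -1, 0, 0],
                   [3,  0, 0, 0]])
print(block_mean_positive(sample, 2))  # first block averages 1 and 3, second has no positives
```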

edited Sep 1, 2015 at 5:44

answered Aug 31, 2015 at 20:32

P. Camilleri


6 Comments

askewchan Over a year ago

To generalize beyond 2D (and to let blocksize be different along each axis), use a.reshape(shape).mean(axes) where shape = itertools.chain(*np.broadcast(a.shape/blocksize, blocksize)) and axes = tuple(range(1, 2*a.ndim, 2))

user308827 Over a year ago

Thanks, @askewchan! I am getting the following error: TypeError: unsupported operand type(s) for /: 'tuple' and 'int', for the line: shape = itertools.chain(*numpy.broadcast(data.shape/block_size, block_size)).

askewchan Over a year ago

Oh, sorry, that assumes blocksize to be a numpy array.
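For reference, a sketch of the N-dimensional idea from this thread that avoids the TypeError by working with plain integers. The function name `block_mean_nd` is hypothetical, it uses a zip-based interleave rather than the itertools.chain expression above, and cropping axes that are not a multiple of the block size is an assumption.

```python
import numpy as np

def block_mean_nd(a, blocksize):
    """Average non-overlapping blocks of an N-d array.

    blocksize may be a single int or one int per axis.
    """
    a = np.asarray(a)
    # One block size per axis, as plain Python ints
    block = [int(b) for b in np.broadcast_to(blocksize, (a.ndim,))]
    counts = [s // b for s, b in zip(a.shape, block)]
    # Assumption: crop each axis to a whole number of blocks
    cropped = a[tuple(slice(0, n * b) for n, b in zip(counts, block))]
    # Interleave (number of blocks, block size) for every axis
    shape = tuple(v for pair in zip(counts, block) for v in pair)
    # The block-size axes sit at the odd positions: 1, 3, 5, ...
    axes = tuple(range(1, 2 * a.ndim, 2))
    return cropped.reshape(shape).mean(axis=axes)

cube = np.arange(24, dtype=float).reshape(2, 3, 4)
print(block_mean_nd(cube, [2, 3, 2]).shape)  # (1, 1, 2)
```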

user308827 Over a year ago

Hi M. Massias! Thanks for the solution. I am having a tough time replacing the constants in your solution by the variables. Would this work:

dimensions = data.shape
height = int(numpy.floor(dimensions[0]/block_size))
width  = int(numpy.floor(dimensions[1]/block_size))
t = data.reshape([height, block_size, width, block_size])
avrgd = numpy.mean(t, axis=(1, 3))

P. Camilleri Over a year ago

@user308827 This code works fine for me. What's going wrong, do you get unexpected values, error messages? You can use // for integer division instead of calling np.floor.
