Finding relative maximums of a 2-D numpy array

Question

I have a 2-D numpy array that can be subdivided into 64 boxes (think of a chessboard). The goal is a function that returns the position and value of the maximum in each box. Something like:

FindRefs(array) --> [(argmaxX00, argmaxY00, Max00), ...,(argmaxX63, argmaxY63, Max63)]

where argmaxXnn and argmaxYnn are the indexes of the whole array (not of the box), and Maxnn is the max value in each box. In other words,

Maxnn = array[argmaxYnn,argmaxYnn]

I've tryed the obvious "nested-for" solution:

def FindRefs(array):
    Height, Width = array.shape
    plumx = []
    plumy = []
    lum = []
    w = int(Width/8)
    h = int(Height/8)
    for n in range(0,8):    # recorrer boxes
        x0 = n*w
        x1 = (n+1)*w
        for m in range(0,8):
            y0 = m*h
            y1 = (m+1)*h
            subflatind = a[y0:y1,x0:x1].argmax() # flatten index of box
            y, x = np.unravel_index(subflatind, (h, w))
            X = x0 + x
            Y = y0 + y
            lum.append(a[Y,X])
            plumx.append(X)
            plumy.append(Y)

    refs = []
    for pt in range(0,len(plumx)):
        ptx = plumx[pt]
        pty = plumy[pt]
        refs.append((ptx,pty,lum[pt]))
    return refs

It works, but is neither elegant nor eficient. So I've tryed this more pythonic version:

def FindRefs(a):
    box = [(n*w,m*h) for n in range(0,8) for m in range(0,8)]
    flatinds = [a[b[1]:h+b[1],b[0]:w+b[0]].argmax() for b in box]
    unravels = np.unravel_index(flatinds, (h, w))
    ur = [(unravels[1][n],unravels[0][n]) for n in range(0,len(box))]
    absinds = [map(sum,zip(box[n],ur[n])) for n in range(0,len(box))]
    refs = [(absinds[n][0],absinds[n][1],a[absinds[n][1],absinds[n][0]]) for n in range(0,len(box))] 
    return refs

It works fine but, to my surprise, is not more efficient than the previous version!

The question is: Is there a more clever way to do the task?

Note that efficiency matters, as I have many large arrays for processing.

Any clue is welcome. :)

possible duplicate of How can I efficiently process a numpy array in blocks similar to Matlab's blkproc (blockproc) function — Bas Swinckels
– Bas Swinckels, Commented Mar 8, 2015 at 18:01

cge · Accepted Answer · 2015-03-09 23:47:35Z

2

Try this:

from numpy.lib.stride_tricks import as_strided as ast
import numpy as np
def FindRefs3(a):
    box = tuple(x/8 for x in a.shape)
    z=ast(a, \
          shape=(8,8)+box, \
          strides=(a.strides[0]*box[0],a.strides[1]*box[1])+a.strides)
    v3 = np.max(z,axis=-1)
    i3r = np.argmax(z,axis=-1)
    v2 = np.max(v3,axis=-1)
    i2 = np.argmax(v3,axis=-1)
    i2x = np.indices(i2.shape)
    i3 = i3r[np.ix_(*[np.arange(x) for x in i2.shape])+(i2,)]
    i3x = np.indices(i3.shape)
    ix0 = i2x[0]*box[0]+i2
    ix1 = i3x[1]*box[1]+i3
    return zip(np.ravel(ix0),np.ravel(ix1),np.ravel(v2))

Note that your first FindRefs reverses indices, so that for a tuple (i1,i2,v), a[i1,i2] won't return the right value, whereas a[i2,i1] will.

So here's what the code does:

It first calculates the dimensions that each box needs to have (box) given the size of your array. Note that this doesn't do any checking: you need to have an array that can be divided evenly into an 8 by 8 grid.
Then z with ast is the messiest bit. It takes the 2d array, and turns it into a 4d array. The 4d array has dimensions (8,8,box[0],box[1]), so it lets you choose which box you want (the first two axes) and then what position you want in the box (the next two). This lets us deal with all the boxes at once by doing operations on the last two axes.
v3 gives us the maximum values along the last axis: in other words, it contains the maximum of each column in each box. i3r contains the index of which row in the box contained that max value.
v2 takes the maximum of v3 along its own last axis, which is now dealing with rows in the box: it takes the column maxes, and finds the maximum of them, so that v2 is a 2d array containing the maximum value of each box. If all you wanted were the maximums, this is all you'd need.
i2 is the index of the column in the box that holds the maximum value.
Now we need to get the index of the row in the box... that's trickier. i3r contains the row index of the max of each column in the box, but we want the row for the specific column that's specified in i2. We do this by choosing an element from i3r using i2, which gives us i3.
At this point, i2 and i3 are 8 by 8 arrays containing the row and column indexes of the maximums relative to each box. We want the absolute indexes. So we create i2x and i3x (actually, this is pointless; we could just create one, as they are the same), which are just arrays of what the indexes for i2 and i3 are (0,1,2,...,8 etc in one dimension, and so on). We then multiply these by the box sizes, and add the relative max indexes, to get the absolute max indexes.
We then combine these to get the same output that you had. Note that if you keep them as arrays, though, instead of making tuples, it's much faster.

edited Mar 9, 2015 at 23:47

answered Mar 8, 2015 at 19:27

cge

10k3 gold badges36 silver badges52 bronze badges

Sign up to request clarification or add additional context in comments.

7 Comments

Veedrac Over a year ago

Note that you don't need backslashes in the call to ast since it's surrounded by brackets.

cge Over a year ago

True - for some reason I tend to obsessively place backslashes everywhere, perhaps worrying that if I don't, I'll forget when I switch to another language that needs them.

Rodolfo Leibner Over a year ago

Wow, definitely strides stuff is the key ! I must study Numpy more thoroughly. Meanwhile, I'm trying to understand the logic of your code

Rodolfo Leibner Over a year ago

, since the line " i3 = i2.choose(np.rollaxis(i3r,-1)) " reports: ValueError: Need between 2 and (32) array objects (inclusive).

cge Over a year ago

Bah - I should have known that undocumented limitation would come up. See stackoverflow.com/q/28931900/599265 for more info on that point. I'll try to get that the NPY_MAXARGS limitation on np.choose documented, because it's rather annoying to have it suddenly appear. In any case, the modified version is more confusing, but should work.

|

Collectives™ on Stack Overflow

Finding relative maximums of a 2-D numpy array

1 Answer 1

7 Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

7 Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related