numpy 3d array - selecting maximum elements along given axis

Question

I have a numpy array that has dimensions (x, y, z) = (5, 50, 4). For every (x, y) pair, I want to find the index of the maximum value along the z axis. This index is in range(4). I want to select all of these "maximum" elements and set them to 1. Then, I want to select all other elements and set them to zero.

To explain it another way, I want to look at all vectors in the "z" direction (there are x*y of these vectors total). I want to set the maximum element to 1 and all other elements to 0. For example, the vector (0.25, 0.1, 0.5, 0.15) will become (0, 0, 1, 0).

I've tried many different things. The argmax function seems like it should help. But how do I use it to select elements correctly? I have tried things like...

x = data
i = x.argmax(axis = 2)
x[i] # shape = (5, 50, 50, 4)
x[:,:,i] # shape = (5, 50, 5, 50)
x[np.unravel_index(i), x.shape] # shape = (5, 50)

The last one, which uses np.unravel_index, has the correct shape, but the selected indices are NOT the maximum values along the z axis. So I'm having some trouble. If anyone could help at all, it would be really awesome. Thanks!

Edit: Here is a way I have found to do this. But if anyone has anything that is faster, please let me know!

def fix_vector(a):
    i = a.argmax()
    a = a*0
    a[i] = 1
    return a

y = np.apply_along_axis(fix_vector, axis=2, arr=x)

I would really like to optimize this if possible, since I call this function MANY times.

Edit: Thanks DSM for a nice solution. Here is a small example dataset, as requested in the comments.

data = np.random.random((3,5,4))
desired_output = np.apply_along_axis(fix_vector, axis=2, arr=data)

This uses the fix_vector function I posted above, but DSM's solution is faster. Thanks again!

You should add a, small, example of the input array and the desired result after processing. — wwii
– wwii, Commented Aug 13, 2015 at 3:40

DSM · Accepted Answer · 2015-08-13 03:41:01Z

2

This isn't particularly elegant, but:

def faster(x):
    d = x.reshape(-1, x.shape[-1])
    d2 = np.zeros_like(d)
    d2[np.arange(len(d2)), d.argmax(1)] = 1
    d2 = d2.reshape(x.shape)
    return d2

seems a little faster than the fix_vector approach. For example:

>>> x,y,z = 5,50,4
>>> data = np.random.random((x,y,z))
>>> np.allclose(orig(data), faster(data))
True
>>> %timeit -n 1000 orig(data)
1000 loops, best of 3: 5.77 ms per loop
>>> %timeit -n 1000 faster(data)
1000 loops, best of 3: 36.6 µs per loop

answered Aug 13, 2015 at 3:41

DSM

355k67 gold badges606 silver badges504 bronze badges

Sign up to request clarification or add additional context in comments.

Comments

wwii · Accepted Answer · 2015-08-13 04:41:31Z

1

This can be done with numpy.where

import numpy as np
a = np.array([[[ 0.25,  0.10 ,  0.50 ,  0.15],
               [ 0.50,  0.60 ,  0.40 ,  0.30]],
              [[ 0.25,  0.50 ,  0.20 ,  0.70],
               [ 0.80,  0.10 ,  0.50 ,  0.15]]])

Find the max value along the last axis

b = a.max(-1)    #shape is (2,2)

Add an axis to b so that it will broadcast across a and create a boolean array.

condition = a == b[..., np.newaxis]

Use numpy.where for the substitution.

c = np.where(condition, 1, 0)

>>> c
array([[[0, 0, 1, 0],
        [0, 1, 0, 0]],

       [[0, 0, 0, 1],
        [1, 0, 0, 0]]])

def f(a):
    b = a.max(-1)
    condition = a == b[..., np.newaxis]
    return np.where(condition, 1, 0)

So I wasn't satisfied - I've seen cases where reshape is faster than adding a new axis so I played around a bit. Seems numpy.where itself is a tad slow. Using boolean indexing with an assignment gives pretty much the same performance as @DSM's.

def h(a):
    b = a.max(-1)
    condition = a == b[..., np.newaxis]
    a[condition] = 1
    a[np.logical_not(condition)] = 0
    return a

edited Aug 13, 2015 at 4:41

answered Aug 13, 2015 at 4:17

wwii

23.9k7 gold badges42 silver badges80 bronze badges

4 Comments

wwii Over a year ago

Comparable but a tad (50%) slower than @DSM's - looks like they both scale the same with the size of the array.

DSM Over a year ago

Note that conceivably this could result in multiple 1s in a row if the maximum isn't unique. But for all we know, this might be what the OP wants to happen in that case (the question is ambiguous on the point).

wwii Over a year ago

@DSM - yours chooses the lowest index for multiple maximums in a row?

adn bps Over a year ago

thanks for the solution, wwii. this is exactly what i was looking for. just fyi, for my application, i never encounter two identical values, so both solutions should work.

hpaulj · Accepted Answer · 2015-08-13 16:52:44Z

Sample x, (2,3,4) in shape:

In [934]: x
Out[934]: 
array([[[ 5,  7,  4, 15],
        [ 9, 23, 14,  9],
        [17, 13,  2,  1]],

       [[ 2, 18,  3, 18],
        [10, 12, 23, 14],
        [17, 23, 19, 13]]])

The max values along the last axis - clear enough, a (2,3) array:

In [935]: x.max(2)
Out[935]: 
array([[15, 23, 17],
       [18, 23, 23]])

Cooresponding indexes along that axis

In [936]: I=x.argmax(2)

In [937]: I
Out[937]: 
array([[3, 1, 0],
       [1, 2, 1]], dtype=int32)

One way of indexing x with those indexes. At this point I'm constructing the 0 and 1 axis indexes manually. It needs to be automated

In [938]: x[np.arange(2)[:,None], np.arange(3)[None,:], I]
Out[938]: 
array([[15, 23, 17],
       [18, 23, 23]])

The same indexes can be used to set values in another array:

In [939]: y=np.zeros_like(x)   
In [940]: y[np.arange(2)[:,None],np.arange(3)[None,:],I]=x.max(2)
In [941]: y
Out[941]: 
array([[[ 0,  0,  0, 15],
        [ 0, 23,  0,  0],
        [17,  0,  0,  0]],

       [[ 0, 18,  0,  0],
        [ 0,  0, 23,  0],
        [ 0, 23,  0,  0]]])

The trick is to come up with a more streamlined way of generating the indexing tuple:

(np.arange(2)[:,None], np.arange(3)[None,:], I)

(array([[0],
    [1]]), 
 array([[0, 1, 2]]), 
 array([[3, 1, 0],
    [1, 2, 1]], dtype=int32))

np.ix_ helps, though concatenating 2 tuples is a little ugly:

J = np.ix_(range(x.shape[0]), range(x.shape[1])) + (I,)

Indexing on a flattened array can be faster:

In [998]: J1 = np.ravel_multi_index(J, x.shape)
In [999]: J1
Out[999]: 
array([[ 3,  5,  8],
     [13, 18, 21]], dtype=int32)

In [1000]: x.flat[J1]
Out[1000]: 
array([[15, 23, 17],
   [18, 23, 23]])

In [1001]: y.flat[J1]=2

In [1002]: y
Out[1002]: 
array([[[0, 0, 0, 2],
        [0, 2, 0, 0],
        [2, 0, 0, 0]],

       [[0, 2, 0, 0],
        [0, 0, 2, 0],
        [0, 2, 0, 0]]])

But I still need to construct the J indexing tuple.

This is similar to DSMs solution, but sticks with the 3 dimensions. Flattening 2 of the dimensions eliminates the need for the xi_ construction.

def max_index(x, n, fn=np.argmax):
    # return multidimensional indexing tuple
    # applying fn along axis n
    I = fn(x, axis=n)
    alist = list(np.ix_(*[range(i) for i in I.shape]))
    alist.insert(n, I)
    return tuple(alist)

is a general purpose function to generate the J like indexing tuple

e.g.

y = np.zeros_like(x)
I = max_index(x, 1, np.argmin)
y[I] = x[I]
I = max_index(x, 1, np.argmax)
y[I] = x[I]

inserts both the minimum and maximum values of x (along axis 1) into y. Speed is about half of DSMs faster.

thanks for the solution, hpaulj. i think DSM and wwii provided more elegant solutions, but it was interesting to see how this works too.

Collectives™ on Stack Overflow

numpy 3d array - selecting maximum elements along given axis

3 Answers 3

Comments

4 Comments

1 Comment

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

Comments

4 Comments

1 Comment

Your Answer

Sign up or log in

Post as a guest

Related