Index numpy nd array along last dimension

Question

Is there an easy way to index a numpy multidimensional array along the last dimension, using an array of indices? For example, take an array a of shape (10, 10, 20). Let's assume I have an array of indices b, of shape (10, 10) so that the result would be c[i, j] = a[i, j, b[i, j]].

I've tried the following example:

import numpy as np

a = np.ones((10, 10, 20))
b = np.tile(np.arange(10) + 10, (10, 1))
c = a[b]

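However, this doesn't work: a[b] uses b to index only the first axis (it selects a[b[i, j], :, :], and here raises an IndexError because b holds values 10..19 while the first axis has length 10), which is not the same as a[i, j, b[i, j]]. Is there an easy way to do this without resorting to a loop?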

Just to make sure I understand properly, you want c[i, j] = a[i, j, b[i, j]]?
– mgilson, Commented Dec 3, 2014 at 17:00
This is normally done like c = a[np.arange(b.shape[0]), np.arange(b.shape[1]), b] but I'm hoping there's a better way.
– user2379410, Commented Dec 3, 2014 at 20:31
@moarningsun That fails my correctness test. I think you must convert one of the two aranges to a column vector.
– Bas Swinckels, Commented Dec 3, 2014 at 21:22
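
The fix suggested in the last comment can be sketched as follows. This is a minimal illustration, not code from the thread: the small shapes, the seeded generator, and the names rows and cols are assumptions chosen so the result is easy to check. With two flat aranges the index arrays broadcast to c[i, j] = a[j, j, b[i, j]]; turning the first arange into a column vector restores the intended pairing.

```python
import numpy as np

rng = np.random.default_rng(0)            # seeded only for reproducibility
a = rng.random((3, 4, 5))                 # small illustrative shape
b = rng.integers(0, 5, size=(3, 4))       # indices into the last axis

rows = np.arange(a.shape[0])[:, None]     # column vector, shape (3, 1)
cols = np.arange(a.shape[1])[None, :]     # row vector, shape (1, 4)

# rows, cols and b broadcast to shape (3, 4), so c[i, j] = a[i, j, b[i, j]]
c = a[rows, cols, b]
```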

Bas Swinckels · Accepted Answer · 2014-12-04 08:28:30Z

There are several ways to do this. Let's first generate some test data:

In [1]: a = np.random.rand(10, 10, 20)

In [2]: b = np.random.randint(20, size=(10,10))  # random integers in range 0..19

One way to solve this is to create two index vectors of 0..9 with meshgrid, one a column vector (for the first axis) and the other a row vector (for the second axis):

In [3]: i1, i0 = np.meshgrid(range(10), range(10), sparse=True)

In [4]: c = a[i0, i1, b]

This works because i0 (shape 10x1), i1 (shape 1x10) and b (shape 10x10) are all broadcast to a common 10x10 shape, so each output element is a[i, j, b[i, j]]. Quick test for correctness:

In [5]: all(c[i, j] == a[i, j, b[i, j]] for i in range(10) for j in range(10))
Out[5]: True
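
For readers on newer NumPy versions, the same lookup can also be sketched with np.take_along_axis (available since NumPy 1.15). This is an alternative technique, not the one used in this answer. The seeded generator is an assumption for reproducibility; the shapes match the test data above.

```python
import numpy as np

rng = np.random.default_rng(0)            # seeded only for reproducibility
a = rng.random((10, 10, 20))
b = rng.integers(0, 20, size=(10, 10))    # indices into the last axis

# take_along_axis expects the index array to have the same number of
# dimensions as a, so add a trailing axis of length 1 and drop it afterwards.
c = np.take_along_axis(a, b[..., None], axis=-1)[..., 0]
```

Because no row or column index arrays are built, this form does not depend on the sizes of the leading axes.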

Another way would be to use choose and rollaxis:

# choose needs a sequence of length 20, so move last axis to front
In [22]: aa = np.rollaxis(a, -1)  

In [23]: c = np.choose(b, aa)

In [24]: all(c[i, j] == a[i, j, b[i, j]] for i in range(10) for j in range(10))
Out[24]: True
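
As a side note, the axis move can also be written with np.moveaxis, which the NumPy documentation recommends over rollaxis for new code. A minimal sketch under the same shapes as above (the seeded generator and the name choices are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)            # seeded only for reproducibility
a = rng.random((10, 10, 20))
b = rng.integers(0, 20, size=(10, 10))    # indices into the last axis

# Move the last axis to the front so that choose sees 20 arrays of shape (10, 10)
choices = np.moveaxis(a, -1, 0)           # shape (20, 10, 10)
c = np.choose(b, choices)
```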

edited Dec 4, 2014 at 8:28

answered Dec 3, 2014 at 21:01

4 Comments

tiago Over a year ago

Thank you. I like your first option better. The choose option looks nice, but it is not general enough as choose does not work for indices larger than 32 (a current bug in numpy, see discussion on increasing NPY_MAXARGS). According to my tests, it is also twice as slow as the meshgrid option.

Bas Swinckels Over a year ago

In that case, you should probably look into functions like mgrid and ogrid, which are referenced from the meshgrid documentation. But do you work with matrices with more than 32 dimensions? I feel sorry for you, my head already explodes when trying to understand an array with more than 2 dimensions :).

Bas Swinckels Over a year ago

Ah, I see your error now when going from 10x10x20 to 10x10x40. I guess choose unpacks the 3D matrix into a list of 2D matrices along the first dimension.

KobeJohn Over a year ago

@BasSwinckels I tried this with unequal row and column sizes and it didn't work (IndexError: shape mismatch: indexing arrays could not be broadcast together). I'm not familiar with meshgrid so could you update it?
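
A likely cause of that shape mismatch is the argument order: with meshgrid's default 'xy' indexing, the first argument runs along the columns, so it should span the second axis of a. The sketch below shows one way to handle unequal sizes. The shapes and the names n_rows, n_cols and depth are assumptions for illustration, and the ogrid variant follows the pointer to ogrid in the comment above.

```python
import numpy as np

rng = np.random.default_rng(0)            # seeded only for reproducibility
n_rows, n_cols, depth = 3, 5, 7           # deliberately unequal sizes
a = rng.random((n_rows, n_cols, depth))
b = rng.integers(0, depth, size=(n_rows, n_cols))

# First argument spans the columns (axis 1), second spans the rows (axis 0)
i1, i0 = np.meshgrid(range(n_cols), range(n_rows), sparse=True)
c = a[i0, i1, b]                          # i0 is (3, 1), i1 is (1, 5)

# ogrid builds the same open grids in axis order, which avoids the swap
r, s = np.ogrid[:n_rows, :n_cols]
c_ogrid = a[r, s, b]
```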
