How to properly broadcast array indexing for NumPy arrays

Question

Short description:

I have two numpy arrays.

data, where data.shape is a tuple with X entries
indices, where indices.shape is the tuple (X, Y)

indices is basically a list of index arrays. The arrays along the second dimension specify lists of indices for the corresponding dimension in data.

indices[0,:] is a list of indices for the first dimension of data.
indices[1,:] is a list of indices for the second dimension of data.

I would like to select every combination of these indices, i.e. their outer product.

The syntax I would like to use is simply:

data[indices]

EDIT:

This was a long step-by-step review of everything I tried. It is now obsolete: I found the solution, which is below.

UPDATE:

I found a solution. In my answer below, there's an explanation of how this indexing works. You probably want to use Divakar's version though: he shows np.ix_, which does exactly what is needed in one call.

@downvoter, could you at least comment?

– lhk, Commented Dec 8, 2016 at 16:28

Divakar · Accepted Answer · 2016-12-09 08:27:02Z

2

We can use np.ix_ to create such broadcastable indexing arrays, which can then be used directly for indexing. With indices as an array of shape (M, N), where N is the number of dimensions of the data array, the following works on ndarrays of any number of dimensions:

data[np.ix_(*indices.T)]

If indices has shape (N, M), where N is the number of dimensions of the data array, skip the transpose: data[np.ix_(*indices)].
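As a small illustration of both layouts, here is a sketch with made-up data. The array contents and the names grid, selected, indices_mn and selected_t are assumptions chosen for the example, not part of the original answer.

```python
import numpy as np

data = np.arange(24).reshape(2, 3, 4)

# Layout (N, M): one row of indices per dimension of data, here N=3, M=2.
indices = np.array([[0, 1],    # indices along axis 0
                    [0, 2],    # indices along axis 1
                    [1, 3]])   # indices along axis 2

# np.ix_ turns the three rows into arrays of shape (2,1,1), (1,2,1), (1,1,2).
grid = np.ix_(*indices)
selected = data[grid]          # every combination, shape (2, 2, 2)

# Layout (M, N): same indices stored the other way round, so transpose first.
indices_mn = indices.T
selected_t = data[np.ix_(*indices_mn.T)]

print(selected.shape)
print(selected[1, 1, 1] == data[1, 2, 3])
```

Each entry selected[i, j, k] is data[indices[0, i], indices[1, j], indices[2, k]]. Because np.ix_ takes separate 1-D sequences, a plain list of lists with different lengths per dimension should work as well.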

edited Dec 9, 2016 at 8:27

answered Dec 8, 2016 at 17:09

Divakar


3 Comments

lhk Over a year ago

beautiful, it's the same code I posted below but in one handy command. Thank you very much

lhk Over a year ago

I just tried your code. I think the transposing of the indices is wrong. indices has the shape (X,Y) where X is the number of dimensions of data. In my test transposing indices led to an error "too many indices for array"

Divakar Over a year ago

@lhk As stated in the post: "... with indices as an array of shape (M, N), where N would represent the number of dimension in the data array". If indices is the other way around, as in your case ("shape (X,Y) where X is the number of dimensions of data"), then don't transpose and just use data[np.ix_(*indices)]. Updated the post.

lhk · Accepted Answer · 2016-12-08 17:00:48Z

The best way to select items from a data array is by using tuples.

You need to create a tuple whose items are lists of indices, one list per dimension. But there is a catch: the lists are matched up element-wise (zipped together), and NumPy broadcasts them against each other while doing so.

Take this tuple for example:

( [0,1], [0,1] )

It selects the elements at [0,0] and [1,1], not all four combinations.
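A quick check of this pairing behaviour, using a small made-up 3x3 array (the names paired and wanted are assumptions for the example):

```python
import numpy as np

data = np.arange(9).reshape(3, 3)

# Two equal-length index lists are paired element-wise,
# so this picks data[0, 0] and data[1, 1] only.
paired = data[([0, 1], [0, 1])]

# What an outer product of the two lists would select instead,
# written out with plain Python loops for comparison.
wanted = np.array([[data[i, j] for j in [0, 1]] for i in [0, 1]])

print(paired.shape)   # (2,)
print(wanted.shape)   # (2, 2)
```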

You can try functions like np.tile or np.repeat to match the correct indices with each other. I tried, and it is very complicated to get right.

There's an easier way:

import numpy as np

indices = np.arange(10)
data = np.random.randn(10, 10, 10)

x_list = indices
y_list = indices
z_list = indices

# Index arrays of the same shape are paired element-wise:
# the shapes (10,), (10,) and (10,) broadcast to (10,)
data[(x_list, y_list, z_list)].shape    # (10,)

# Now the trick: give each list its own axis
x_list = x_list.reshape((-1, 1, 1))
y_list = y_list.reshape((1, -1, 1))
z_list = z_list.reshape((1, 1, -1))

# This broadcasts, too, but now the shapes
# (10, 1, 1), (1, 10, 1) and (1, 1, 10) combine to (10, 10, 10)
data[(x_list, y_list, z_list)].shape    # (10, 10, 10)
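The reshape trick can be generalised to any number of dimensions. The sketch below uses a hypothetical helper, open_mesh (a name invented for this example), and compares its result with np.ix_ on made-up index lists of different lengths.

```python
import numpy as np

def open_mesh(index_lists):
    """Hypothetical helper: reshape the k-th list so it varies along axis k only."""
    n = len(index_lists)
    reshaped = []
    for axis, idx in enumerate(index_lists):
        shape = [1] * n
        shape[axis] = -1          # -1 lets NumPy infer this list's length
        reshaped.append(np.asarray(idx).reshape(shape))
    return tuple(reshaped)

data = np.random.randn(4, 5, 6)
index_lists = [[0, 3], [1, 2, 4], [5]]   # one list per dimension of data

by_hand = data[open_mesh(index_lists)]
with_ix = data[np.ix_(*index_lists)]

print(by_hand.shape)                      # (2, 3, 1)
print(np.array_equal(by_hand, with_ix))
```

Both routes index data with a tuple of arrays shaped (2, 1, 1), (1, 3, 1) and (1, 1, 1), so the results should match.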
