4

I have a 3D NumPy array of shape (n_samples, num_components, 2). In the example below, n_samples = 5 and num_components = 7.

I have another array (indices) of shape (n_samples,) that holds the selected component for each sample.

I want to use indices to select from the data array so that the resulting array has shape (n_samples, 2).

The code is below:

import numpy as np
np.random.seed(77)
data = np.random.randint(low=0, high=10, size=(5, 7, 2))
indices = np.array([0, 1, 6, 4, 5])
# How can I select indices from the data array?

For example, for sample 0 the selected component should be component 0, and for sample 1 it should be component 1.

Note that I can't use any for loops because I'm using this in Theano, so the solution should be based solely on NumPy.

3 Answers

5

Is this what you are looking for?

In [36]: data[np.arange(data.shape[0]),indices,:]
Out[36]: 
array([[7, 4],
       [7, 3],
       [4, 5],
       [8, 2],
       [5, 8]])
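
As a minimal sketch of why this works, using the question's own setup: passing np.arange(n_samples) and indices together pairs them element by element, so entry i of the result is data[i, indices[i]], and the trailing axis of length 2 is kept. The names rows, selected, expected and alt are illustrative only, and the np.take_along_axis variant is an alternative not mentioned in the answer (it assumes NumPy 1.15 or newer).

```python
import numpy as np

np.random.seed(77)
data = np.random.randint(low=0, high=10, size=(5, 7, 2))
indices = np.array([0, 1, 6, 4, 5])

# Pair each sample number with its selected component:
# (0, indices[0]), (1, indices[1]), ...
rows = np.arange(data.shape[0])
selected = data[rows, indices]  # shape (n_samples, 2)

# Reference result built with an explicit loop, only to check the indexing.
expected = np.stack([data[i, indices[i]] for i in range(data.shape[0])])
print(selected.shape)                      # (5, 2)
print(np.array_equal(selected, expected))  # True

# Alternative (assumption: NumPy >= 1.15): take_along_axis needs an index
# array with the same number of dimensions as data, hence the two new axes.
alt = np.take_along_axis(data, indices[:, None, None], axis=1)[:, 0, :]
print(np.array_equal(alt, expected))       # True
```

The explicit slice in the answer (the trailing `:`) is optional here, since axes that are not indexed are kept whole.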

1 Comment

yeah, that's it. I think this is the best answer.
4

To get component #0, use

data[:, 0]

i.e. we get every entry on axis 0 (samples), and only entry #0 on axis 1 (components), and implicitly everything on the remaining axes.

This can be easily generalized to

data[:, indices]

to select all relevant components.


But what the OP really wants is just the diagonal of this array, i.e. (data[0, indices[0]], data[1, indices[1]], ...). The diagonal of a higher-dimensional array can be extracted with the diagonal function, which by default takes the diagonal over the first two axes and appends it as the last axis:

>>> np.diagonal(data[:, indices])
array([[7, 7, 4, 8, 5],
       [4, 3, 5, 2, 8]])

(The result has shape (2, 5), so you need to transpose it to get the desired (5, 2) array.)
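
A rough sketch of the shapes involved in this approach, assuming the question's data and indices; the variable names intermediate, diag, result and direct are illustrative only. It also checks the outcome against the direct paired-index lookup, which avoids building the intermediate array.

```python
import numpy as np

np.random.seed(77)
data = np.random.randint(low=0, high=10, size=(5, 7, 2))
indices = np.array([0, 1, 6, 4, 5])

# Every sample combined with every selected component.
intermediate = data[:, indices]
print(intermediate.shape)  # (5, 5, 2)

# The diagonal over axes 0 and 1 is appended as the LAST axis.
diag = np.diagonal(intermediate)
print(diag.shape)          # (2, 5)

# Transpose to get one row per sample.
result = diag.T
print(result.shape)        # (5, 2)

# Same values as pairing each sample with its own component directly.
direct = data[np.arange(data.shape[0]), indices]
print(np.array_equal(result, direct))  # True
```

The intermediate array grows with n_samples squared, so the direct lookup is likely the cheaper choice for large inputs.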

8 Comments

data[:, indices] results in shape (5, 5, 2) while I need a (5, 2) shape.
@Afshin data[:, 0] is (5, 2), for a single component. Which (5, 2) array do you want when you have five components?
data[:,indices][np.arange(n_samples),np.arange(n_samples)] works but is obtuse.
@Afshin What about np.diagonal(data[:, indices]).T
@Afshin Updated. I guess hpaulj's one is faster since it doesn't need to build the intermediate #Samples×#Indices×2 array.
2

There are a variety of ways to do this, but this is my loop-based recommendation:

selection = np.array([ datum[indices[k]] for k,datum in enumerate(data)])

The resulting array, selection, has the desired shape.

2 Comments

I can't use for loops because I'm using Theano, need a solely numpy solution.
I think you should add that constraint to your original post.
