Say that I have two arrays a and b:

a = np.array([[1,2,3], [4,5,6], [7,8,9]])
b = np.array([[3,1,0], [1,2,3], [3,0,2]])

I want to select, from each row in a, the item which corresponds to the highest value (within row) in b, i.e. I want output [1, 6, 7].

What would be a fast solution to this problem in pandas/NumPy, and would it be faster than using a for-loop in plain Python? It seems very simple, but I have not found a good solution. I'm new to pandas/NumPy, but I think there must be a simple way to do this.

1 Answer

You can use np.argmax with axis=1 to get the index of the largest value in each row of b, then use advanced indexing to get the elements you want from a, like this:

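import numpy as np

a = np.array([[1,2,3], [4,5,6], [7,8,9]])
b = np.array([[3,1,0], [1,2,3], [3,0,2]])

# Column index of the largest value in each row of b
b_largest_idx = np.argmax(b, axis=1)

# Pair each row index with its column index to pick one element per row
print(a[range(a.shape[0]), b_largest_idx])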

Output:

[1 6 7]
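
As for the for-loop part of the question, here is a minimal sketch that puts the vectorized selection next to a plain Python loop so the two can be checked against each other. The names picked_fancy, picked_take and picked_loop are made up for illustration, and np.take_along_axis is an alternative to the indexing shown above, not something the original answer used. No timings are included; on arrays this small any difference would not be meaningful, so measure on your own data.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
b = np.array([[3, 1, 0], [1, 2, 3], [3, 0, 2]])

# Column index of the largest value in each row of b.
# If a row has ties, argmax returns the first occurrence.
idx = np.argmax(b, axis=1)

# Variant 1: advanced indexing with an explicit array of row numbers.
picked_fancy = a[np.arange(a.shape[0]), idx]

# Variant 2: take_along_axis expects the indices as a column, shape (n_rows, 1).
picked_take = np.take_along_axis(a, idx[:, None], axis=1).ravel()

# Reference: the same selection written as a plain Python loop.
picked_loop = [int(row_a[np.argmax(row_b)]) for row_a, row_b in zip(a, b)]

print(picked_fancy, picked_take, picked_loop)
```

All three variants select the same elements, so the loop version can serve as a correctness check when trying the vectorized ones on larger inputs.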

1 Comment

You can use list(range(a.shape[0])) instead of hard-typing [0,1,2].
