Given a 2D NumPy array a and a list of indices stored in index, there must be an efficient way to extract the values at those indices. Using a for loop as follows takes about 5 ms, which seems extremely slow for extracting 2000 elements:

import numpy as np
import time

# generate dummy array 
a = np.arange(4000).reshape(1000, 4) 
# generate dummy list of indices
r1 = np.random.randint(1000, size=2000)
r2 = np.random.randint(3, size=2000)
index = np.concatenate([[r1], [r2]]).T

start = time.time()
result = [a[i, j] for [i, j] in index]
print(time.time() - start)

How can I increase the extraction speed? np.take does not seem appropriate here because it would return a 2D array instead of a 1D array.

2 Answers

You can use advanced indexing: take the row and column indices from the index array and use them to index a directly, i.e. a[index[:,0], index[:,1]]. In the timings below this is roughly 180 times faster than the list comprehension:

%timeit a[index[:,0], index[:,1]]
# 12.1 µs ± 368 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit [a[i, j] for [i, j] in index]
# 2.22 ms ± 105 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
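
As a quick sanity check, a sketch along these lines should confirm that the vectorized lookup returns the same 1D result as the loop. The fixed seed and the names `looped` and `vectorized` are assumptions added for illustration; they are not part of the original post.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, assumed here for reproducibility

# Same shapes as in the question: a 1000x4 array and 2000 (row, column) pairs
a = np.arange(4000).reshape(1000, 4)
index = np.column_stack([
    rng.integers(1000, size=2000),  # row indices in [0, 1000)
    rng.integers(4, size=2000),     # column indices in [0, 4)
])

# Element-by-element lookup, as in the question
looped = np.array([a[i, j] for i, j in index])

# Advanced indexing: one array of rows, one array of columns
vectorized = a[index[:, 0], index[:, 1]]

print(vectorized.shape)                    # (2000,)
print(np.array_equal(looped, vectorized))  # True
```

The two index arrays are paired element-wise, so the result has one value per (row, column) pair rather than a 2D block.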

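Another option is numpy.ravel_multi_index, which converts the (row, column) pairs into flat indices into a. It returns positions, not values, so the result still has to be used to index the flattened array:

a.ravel()[np.ravel_multi_index(index.T, a.shape)]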
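
A small worked example may make the flat-index step concrete. The 3x4 array and the three index pairs are made up for illustration, and `flat` and `values` are hypothetical names.

```python
import numpy as np

# Toy data (assumed for illustration): values are 10x their flat position
a = np.arange(12).reshape(3, 4) * 10
index = np.array([[0, 1], [2, 3], [1, 0]])  # (row, column) pairs

# For a C-ordered 3x4 array, each flat index is row * 4 + column
flat = np.ravel_multi_index(index.T, a.shape)

# The flat indices are positions; index the flattened array to get values
values = a.ravel()[flat]

print(flat)    # [ 1 11  4]
print(values)  # [ 10 110  40]
```

For a contiguous array, `a.ravel()` returns a view, so the extra step does not copy the data.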

1 Comment

Thank you. It seems to be even faster because I can use indexT = index.T (once for all).
