How can I extract the elements of a list corresponding to the indices contained in a 1D numpy.ndarray?

Here is an example:

import numpy as np

list_data = list(range(1, 100))
arr_index = np.asarray([18, 55, 22])
arr_index.shape  # (3,)

list_data[arr_index]  # FAILS with a TypeError

I want to be able to retrieve the elements of list_data corresponding to arr_index.

3 Answers

4

You can use numpy.take -

import numpy as np
np.take(list_data,arr_index)

Sample run -

In [12]: list_data = list(range(1, 20))

In [13]: list_data
Out[13]: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

In [14]: arr_index = np.asarray([3, 5, 12])

In [15]: np.take(list_data,arr_index)
Out[15]: array([ 4,  6, 13])
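
Note that the sample run returns an ndarray rather than a list. If a plain list is needed, a minimal sketch (reusing the same list_data and arr_index values as the sample run; the names taken and taken_list are only illustrative) is to call tolist() on the result:

```python
import numpy as np

list_data = list(range(1, 20))
arr_index = np.asarray([3, 5, 12])

# np.take accepts the list as array_like input and returns an ndarray
taken = np.take(list_data, arr_index)

# tolist() converts the result back into a list of plain Python ints
taken_list = taken.tolist()
print(taken_list)  # [4, 6, 13]
```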

11 Comments

So the first argument is array_like, which means that the list is cast to ndarray before use?
@Kasra What do you mean by simple indexing?
@Kasra Apparently simple indexing expects the first input to be a numpy array.
@fgnu: It's cast to ndarray. You can see in the source that for a list input, take ends up calling _wrapit, which calls asarray. Personally, I'd either use a list comprehension or convert the list to an array at the earliest possible point.
@fgnu: Well, if you're only using it once, a list comprehension won't have to go over every element of the list. Which option is fastest depends on how many items you expect to take. You might also try something like map(list_data.__getitem__, arr_index) and see if that helps.
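
The map suggestion in the last comment can be sketched roughly as follows. In Python 3, map returns a lazy iterator, so it is wrapped in list() here. operator.itemgetter is a standard-library alternative that was not mentioned in the thread and is included only for comparison; the names picked_map and picked_getter are illustrative.

```python
import operator
import numpy as np

list_data = list(range(1, 100))
arr_index = np.asarray([18, 55, 22])

# map calls list_data[i] for each index; list() materializes the iterator
picked_map = list(map(list_data.__getitem__, arr_index))

# itemgetter(*indices) returns a tuple when given two or more indices
# (with a single index it would return the bare item instead)
picked_getter = list(operator.itemgetter(*arr_index)(list_data))

print(picked_map)     # [19, 56, 23]
print(picked_getter)  # [19, 56, 23]
```

Neither variant converts list_data to an array, so both avoid the conversion cost discussed in the comments above.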
1

Or use a list comprehension:

import numpy as np
list_data = list(range(1, 100))
arr_index = np.asarray([18, 55, 22])
arr_index.shape

new_ = [list_data[i] for i in arr_index]

>> [19, 56, 23]

Note

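list_data = list(range(1, 100)) can be replaced by list_data = range(1, 100) (in Python 3 a range object is lazy but still supports integer indexing, so the comprehension above works unchanged)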

arr_index = np.asarray([18, 55, 22]) can be replaced by arr_index = np.array([18, 55, 22])
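
As a quick check of the two replacements in this note, here is a minimal sketch, assuming Python 3. The names lazy_data, from_range and from_list are introduced only for illustration.

```python
import numpy as np

# np.array is used here in place of np.asarray, as in the note
arr_index = np.array([18, 55, 22])

# A Python 3 range is lazy but still supports integer indexing
lazy_data = range(1, 100)
from_range = [lazy_data[i] for i in arr_index]

# The same comprehension over a materialized list gives the same values
list_data = list(lazy_data)
from_list = [list_data[i] for i in arr_index]

print(from_range)  # [19, 56, 23]
print(from_list)   # [19, 56, 23]
```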

1 Comment

In py3, you need list(range()) unless you are iterating right away.
1

I just did some timing tests:

In [226]: ll=list(range(20000))    
In [227]: ind=np.random.randint(0,20000,200)

In [228]: timeit np.array(ll)[ind]
100 loops, best of 3: 3.29 ms per loop

In [229]: timeit np.take(ll,ind)
100 loops, best of 3: 3.34 ms per loop

In [230]: timeit [ll[x] for x in ind]
10000 loops, best of 3: 65.1 µs per loop

In [231]: arr=np.array(ll)
In [232]: timeit arr[ind]
100000 loops, best of 3: 6 µs per loop

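Starting from a list, the list comprehension is clearly the winner (65.1 µs versus roughly 3.3 ms for np.array(ll)[ind] and np.take). Indexing an existing array is faster still (6 µs), but the overhead of creating that array is substantial.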

Converting to an object dtype array is faster than converting to a numeric one (about 1 ms versus 3.29 ms, even with the extra tolist() call). I'm a little surprised, but it must be because it can convert without parsing:

In [236]: timeit np.array(ll,dtype=object)[ind].tolist()
1000 loops, best of 3: 1.04 ms per loop
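
To rerun these comparisons outside IPython, something like the following sketch with the standard timeit module should work. The candidates dict and its labels are illustrative, the lambdas add a small call overhead, and absolute numbers will differ by machine and NumPy version, so only the relative ordering is worth reading.

```python
import timeit
import numpy as np

ll = list(range(20000))
ind = np.random.randint(0, 20000, 200)
arr = np.array(ll)  # built once, outside the timed statements

candidates = {
    "np.array(ll)[ind]": lambda: np.array(ll)[ind],
    "np.take(ll, ind)": lambda: np.take(ll, ind),
    "[ll[x] for x in ind]": lambda: [ll[x] for x in ind],
    "arr[ind]": lambda: arr[ind],
    "object array + tolist": lambda: np.array(ll, dtype=object)[ind].tolist(),
}

expected = [ll[x] for x in ind]
for label, func in candidates.items():
    # every approach should select the same values
    assert list(func()) == expected
    # best of 3 repeats of 20 calls each, reported per call
    best = min(timeit.repeat(func, number=20, repeat=3)) / 20
    print(f"{label:<25} {best * 1e6:10.1f} µs per call")
```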
