Write a Cython function:

import cython
from cpython cimport PyList_New, PyList_SET_ITEM, Py_INCREF

@cython.wraparound(False)
@cython.boundscheck(False)
def take(list alist, Py_ssize_t[:] arr):
    cdef:
        Py_ssize_t i, idx, n = arr.shape[0]
        list res = PyList_New(n)
        object obj
    for i in range(n):
        idx = arr[i]
        obj = alist[idx]
        Py_INCREF(obj)  # PyList_SET_ITEM steals a reference, so add one first
        PyList_SET_ITEM(res, i, obj)
    return res
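To reproduce the timings below, the function has to be compiled first. A minimal sketch using a setuptools build script (the file name take_fast.pyx is an illustration, not from the original):

# setup.py -- minimal Cython build script; "take_fast.pyx" holds the
# take() function above (file/module names are hypothetical)
from setuptools import setup
from Cython.Build import cythonize

setup(ext_modules=cythonize("take_fast.pyx"))

Build in place with python setup.py build_ext --inplace, then from take_fast import take. Alternatively, in IPython, %load_ext cython followed by a %%cython cell containing the function works as well.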
The results of %timeit:

import numpy as np

al = list(range(10000))
aa = np.array(al)
ba = np.random.randint(0, len(al), 10000)
bl = ba.tolist()

%timeit [al[i] for i in bl]
1000 loops, best of 3: 1.68 ms per loop

%timeit np.take(aa, ba)
10000 loops, best of 3: 51.4 µs per loop

%timeit take(al, ba)
1000 loops, best of 3: 254 µs per loop
numpy.take() is the fastest if both of the arguments are ndarray objects. The Cython version is about 6.6x faster than the list comprehension (254 µs vs. 1.68 ms).
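As a quick sanity check (not part of the original benchmark), the three variants can be verified to pick the same elements; np.take returns an ndarray, so it is compared via tolist():

ref = [al[i] for i in bl]               # pure-Python reference result
assert take(al, ba) == ref              # Cython version returns an equal list
assert np.take(aa, ba).tolist() == ref  # ndarray result, converted for comparison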
From the comments on operator.itemgetter():

operator.itemgetter() would be a[b]? It's hard to imagine a use for it that won't "extract an individual int for every integer in b"... eventually. If you're concerned about wasting space having a list and a sublist hanging around simultaneously, you could iterate over b at the time of need, instead of the (would-be) a[b].

I don't need it and it slows my code down. I'm using the extracted elements of a afterward (e.g. looking at which ones are None, etc.; it's not really relevant), but that hardly implies I need to extract b's elements manually in the process.

itemgetter creates a class instance with a callable. I think its speed improvement comes from a simpler interpreter or calling stack; it does not use any special C code (that I can see). I get, at best, a 2x speed improvement.

itemgetter seems the fastest: a list comprehension over b is ~56% slower, and a comprehension over np.nditer(b) is ~134% slower.
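For concreteness, the itemgetter pattern discussed in these comments looks roughly like this (a sketch; variable names follow the al/bl naming above):

from operator import itemgetter

# itemgetter(*bl) builds one callable that fetches every index in bl;
# called on the list, it returns a tuple of the selected elements.
getter = itemgetter(*bl)
res = list(getter(al))  # convert the tuple if a list is required

# Caveat: with exactly one index, itemgetter returns the bare item
# rather than a 1-tuple, so the list() conversion would fail.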