Write a Cython function:

import cython
from cpython cimport PyList_New, PyList_SET_ITEM, Py_INCREF

@cython.wraparound(False)
@cython.boundscheck(False)
def take(list alist, Py_ssize_t[:] arr):
    cdef:
        Py_ssize_t i, idx, n = arr.shape[0]
        list res = PyList_New(n)
        object obj
    for i in range(n):
        idx = arr[i]
        obj = alist[idx]
        Py_INCREF(obj)  # PyList_SET_ITEM steals a reference, so add one first
        PyList_SET_ITEM(res, i, obj)
    return res
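To reproduce the timings below, the function has to be compiled first. A minimal sketch using a setuptools build script (the file name take_fast.pyx is an illustration, not from the original):

# setup.py -- minimal Cython build script; "take_fast.pyx" holds the
# take() function above (file/module names are hypothetical)
from setuptools import setup
from Cython.Build import cythonize

setup(ext_modules=cythonize("take_fast.pyx"))

Build in place with python setup.py build_ext --inplace, then from take_fast import take. Alternatively, in IPython, %load_ext cython followed by a %%cython cell containing the function works as well.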
The results of %timeit:

import numpy as np

al = list(range(10000))
aa = np.array(al)
ba = np.random.randint(0, len(al), 10000)
bl = ba.tolist()

%timeit [al[i] for i in bl]
1000 loops, best of 3: 1.68 ms per loop

%timeit np.take(aa, ba)
10000 loops, best of 3: 51.4 µs per loop

%timeit take(al, ba)
1000 loops, best of 3: 254 µs per loop
numpy.take() is the fastest if both of the arguments are ndarray objects. The Cython version is about 6.6x faster than the list comprehension (254 µs vs. 1.68 ms).
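As a quick sanity check (not part of the original benchmark), the three variants can be verified to pick the same elements; np.take returns an ndarray, so it is compared via tolist():

ref = [al[i] for i in bl]               # pure-Python reference result
assert take(al, ba) == ref              # Cython version returns an equal list
assert np.take(aa, ba).tolist() == ref  # ndarray result, converted for comparison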
From the comments on operator.itemgetter():

operator.itemgetter() would be a[b]? It's hard to imagine a use for it that won't "extract an individual int for every integer in b"... eventually. If you're concerned about wasting space having a list and a sublist hanging around simultaneously, you could iterate over b at the time of need, instead of the (would-be) a[b].

I don't need it and it slows my code down. I'm using the extracted elements of a afterward (e.g. looking at which ones are None, etc.; it's not really relevant), but that hardly implies I need to extract b's elements manually in the process.

itemgetter creates a class instance with a callable. I think its speed improvement comes from a simpler interpreter or calling stack; it does not use any special C code (that I can see). I get, at best, a 2x speed improvement.

itemgetter seems the fastest: a list comprehension over b is ~56% slower, and a comprehension over np.nditer(b) is ~134% slower.
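For concreteness, the itemgetter pattern discussed in these comments looks roughly like this (a sketch; variable names follow the al/bl naming above):

from operator import itemgetter

# itemgetter(*bl) builds one callable that fetches every index in bl;
# called on the list, it returns a tuple of the selected elements.
getter = itemgetter(*bl)
res = list(getter(al))  # convert the tuple if a list is required

# Caveat: with exactly one index, itemgetter returns the bare item
# rather than a 1-tuple, so the list() conversion would fail.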