3

I've seen this once or twice before, but I can't seem to find any official docs on it: using Python range objects as indices in NumPy.

import numpy as np
a = np.arange(9).reshape(3,3)
a[range(3), range(2,-1,-1)]
# array([2, 4, 6])

Let's trigger an index error just to confirm that ranges are not in the official range (pun intended) of legal indexing methods:

a['x']

# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
# IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

Now, a slight divergence between numpy and its docs is not entirely unheard of and does not necessarily indicate that a feature is not intended (see for example here).

So, does anybody know why this works at all? And if it is an intended feature what are the exact semantics / what is it good for? And are there any ND generalizations?

13
  • I've never seen this; is it used in any reputable libraries? Commented Nov 2, 2018 at 17:34
  • 1
    numpy predates Python 3. In Python 2, range(3) is a list of integers, which numpy treats as "array-like". It would have been a mess if numpy didn't also handle that in a backwards compatible way in Python 3. Commented Nov 2, 2018 at 18:13
  • "So, does anybody know why this works at all?" It is a nice feature, informally called "fancy" indexing, and in the docs it is called advanced indexing. Commented Nov 2, 2018 at 18:19
  • 1
    @WarrenWeckesser That's right, it says there (...) a non-tuple sequence object (although a non-tuple sequence (such as a list) containing slice objects will trigger basic indexing, it seems). Not sure why it should hang if IndexError is not raised though, but whatever. I think you could make this an answer. Commented Nov 2, 2018 at 18:30
  • It could be that indexing tries np.asarray(x), which works with both range(3) and [0,1,2]. Other things produce errors or object dtype arrays. @WarrenWeckesser makes a good point about compatibility with Py2's version of range. Commented Nov 2, 2018 at 18:36

2 Answers

3

Just to wrap this up (thanks to @WarrenWeckesser in the comments): This behavior is actually documented. One only has to realize that range objects are Python sequences in the strict sense.

So this is just a case of fancy indexing. Be warned, though, that it is very slow:

>>> import numpy as np
>>> from timeit import timeit
>>> a = np.arange(100000)
>>> timeit(lambda: a[range(100000)], number=1000)
12.969507368048653
>>> timeit(lambda: a[list(range(100000))], number=1000)
7.990526253008284
>>> timeit(lambda: a[np.arange(100000)], number=1000)
0.22483703796751797
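
To make the "just fancy indexing" point concrete, here is a minimal sketch showing that indexing with ranges gives the same result as indexing with the integer arrays the ranges convert to. That NumPy internally goes through something like np.asarray is an assumption taken from the comments, not something confirmed here; the helper name range_to_arange is made up for illustration.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
rows, cols = range(3), range(2, -1, -1)

# Index with the range objects directly (the behaviour from the question).
via_range = a[rows, cols]

# Index with explicit integer arrays built from the same ranges.
via_array = a[np.asarray(rows), np.asarray(cols)]

print(via_range)                             # [2 4 6]
print(np.array_equal(via_range, via_array))  # True


def range_to_arange(r):
    """Hypothetical helper: build the equivalent integer array up front,
    so a hot loop does not index with a range object over and over."""
    return np.arange(r.start, r.stop, r.step)


idx = range_to_arange(cols)
print(idx)  # [2 1 0]
```

If the same index is reused many times, converting it once this way avoids paying the conversion cost seen in the timings above on every lookup.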

1

Not a proper answer, but too long for comment.

In fact, it seems to work with about any indexable object:

import numpy as np

class MyIndex:
    def __init__(self, n):
        self.n = n
    def __getitem__(self, i):
        if i < 0 or i >= self.n:
            raise IndexError
        return i
    def __len__(self):
        return self.n

a = np.array([1, 2, 3])
print(a[MyIndex(2)])
# [1 2]

I think the relevant lines in NumPy's code are below this comment in core/src/multiarray/mapping.c:

/*
 * Some other type of short sequence - assume we should unpack it like a
 * tuple, and then decide whether that was actually necessary.
 */

But I'm not entirely sure. For some reason, this hangs if you remove the if i < 0 or i >= self.n: raise IndexError, even though there is a __len__, so at some point it seems to be iterating through the given object until IndexError is raised.
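
The hang is at least consistent with Python's fallback iteration protocol, which can be shown without NumPy. For an object that defines __getitem__ but not __iter__, iter() calls __getitem__ with 0, 1, 2, ... until IndexError is raised and never consults __len__ to decide when to stop. Whether NumPy takes exactly this path is an assumption; the sketch below only demonstrates the Python side, and the class names Bounded and Unbounded are made up for illustration.

```python
from itertools import islice


class Bounded:
    """Same shape as MyIndex above, but records every index requested."""

    def __init__(self, n):
        self.n = n
        self.calls = []

    def __getitem__(self, i):
        self.calls.append(i)
        if i < 0 or i >= self.n:
            raise IndexError
        return i

    def __len__(self):
        return self.n


class Unbounded(Bounded):
    """Bounds check removed: __getitem__ never raises IndexError."""

    def __getitem__(self, i):
        self.calls.append(i)
        return i


b = Bounded(2)
items = list(b)
print(items)    # [0, 1]
print(b.calls)  # [0, 1, 2]  (the last call is the one that raises IndexError)

# Without IndexError the iteration never ends on its own, so cap it with islice.
first_five = list(islice(Unbounded(2), 5))
print(first_five)  # [0, 1, 2, 3, 4], even though len() is 2
```

Any consumer that iterates the object this way would loop forever on the unbounded variant, which matches the observed behaviour.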

1 Comment

Iterating over the object would be consistent with how slow this is, for example compared to indexing with aranges.
