0

What is the difference between indexing a 2D array with [row][col] vs. [row, col] in numpy/pandas? Are there any implications of using one over the other?

For example:

import numpy as np

arr = np.array([[1, 2], [3, 4]])
print(arr[1][0])
print(arr[1, 0])

Both give 3.

2
  • I meant in a numpy array. Commented Apr 5, 2021 at 9:31
  • 1
    In the penultimate paragraph of the section numpy.org/devdocs/user/… it is stated that "...So note that x[0,2] = x[0][2] though the second case is more inefficient". As to why, I suggest you read the rest from the link. Commented Apr 5, 2021 at 9:37

2 Answers

2

Single-element indexing

For single-element indexing, as in your example, the result is indeed the same. However, as stated in the docs:

So note that x[0,2] = x[0][2] though the second case is more inefficient as a new temporary array is created after the first index that is subsequently indexed by 2.

emphasis mine
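To make the "temporary array" point concrete, here is a minimal sketch. The array x and the timing loop are my own illustration, not taken from the docs, and the measured times will vary by machine and NumPy version, so treat them as indicative only.

```python
import timeit

import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6]])

# Chained indexing: x[0] first builds an intermediate ndarray object
# (a view sharing x's memory), which is then indexed a second time.
row = x[0]
print(type(row).__name__, np.shares_memory(row, x))  # ndarray True

# Both forms return the same element.
print(x[0, 2] == x[0][2])  # True

# Rough timing; absolute numbers depend on the machine and NumPy version.
t_tuple = timeit.timeit(lambda: x[0, 2], number=100_000)
t_chained = timeit.timeit(lambda: x[0][2], number=100_000)
print(f"x[0, 2]: {t_tuple:.4f}s   x[0][2]: {t_chained:.4f}s")
```

The intermediate row is a view rather than a copy of the data, but it is still a full array object that has to be constructed and then discarded.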

Array indexing

In this case, double-indexing is not only less efficient; it gives different results. Let's look at an example:

>>> arr = np.array([[1, 2], [3, 4], [5, 6]])
>>> arr[1:][0]
[3 4]
>>> arr[1:, 0]
[3 5]

In the first case, we create a new array after the first index which is all rows from index 1 onwards:

>>> arr[1:]
[[3 4]
 [5 6]]

Then we simply take the first element of that new array which is [3 4].

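In the second case, we use numpy's multidimensional indexing, where each entry in the index applies to its own dimension: 1: selects rows from index 1 onwards along the first axis, and 0 selects column 0 along the second axis. So instead of taking the first of the remaining rows, it takes the first column of those rows: [3 5].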
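A short sketch of the same idea, using the array from the example above. The variable names are just for illustration; the point is that arr[1:, 0] passes a single tuple with one entry per axis, while chaining applies each index to the first axis of whatever the previous step returned.

```python
import numpy as np

arr = np.array([[1, 2], [3, 4], [5, 6]])

# arr[1:, 0] is shorthand for indexing with the tuple (slice(1, None), 0):
# one entry per axis.
by_axes = arr[1:, 0]
same_as_tuple = arr[(slice(1, None), 0)]

# Chained indexing applies both indices to axis 0, one after the other.
chained = arr[1:][0]

# To reproduce arr[1:, 0] with chaining, the second step has to
# address axis 1 explicitly.
chained_axis1 = arr[1:][:, 0]

print(by_axes)        # [3 5]
print(chained)        # [3 4]
print(chained_axis1)  # [3 5]
```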


1

Using [row][col] makes one more function call than using [row, col]. When you index an array (or any object, for that matter), you are calling obj.__getitem__ under the hood. Since Python packs comma-separated indices into a tuple, obj[row][col] is equivalent to obj.__getitem__(row).__getitem__(col), whereas obj[row, col] is simply obj.__getitem__((row, col)). Therefore, indexing with [row, col] is more efficient because it involves one fewer function call (plus some name lookups, which can normally be ignored).
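You can see the two call patterns with a small stand-in class. Recorder is a hypothetical helper written only for this illustration; it is not part of numpy and just logs every key its __getitem__ receives.

```python
class Recorder:
    """Hypothetical helper: logs every key passed to __getitem__."""

    def __init__(self, log):
        self.log = log

    def __getitem__(self, key):
        self.log.append(key)
        return self  # return self so a second [...] can be chained


calls = []
obj = Recorder(calls)

obj[1][0]                     # two __getitem__ calls, one key each
chained_calls = list(calls)

calls.clear()
obj[1, 0]                     # one __getitem__ call with a tuple key
tuple_calls = list(calls)

print(chained_calls)  # [1, 0]
print(tuple_calls)    # [(1, 0)]
```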

1 Comment

The function call is probably less important than the fact that an intermediate array is created from the first call, which requires overhead to determine things about the intermediate result (contiguity, memory layout, etc), that is then just being thrown away.
