
I have:

>>> a
array([[1, 2],
       [3, 4]])

>>> type(l), l # list of scalars
(<type 'list'>, [0, 1])

>>> type(i), i # a numpy array
(<type 'numpy.ndarray'>, array([0, 1]))

>>> type(j), j # list of numpy arrays
(<type 'list'>, [array([0, 1]), array([0, 1])])

When I do

>>> a[l] # Case 1, l is a list of scalars

I get

array([[1, 2],
       [3, 4]])

which means indexing happened only on 0th axis.

But when I do

>>> a[j] # Case 2, j is a list of numpy arrays

I get

array([1, 4])

which means indexing happened along axis 0 and axis 1.

Q1: When used for indexing, why are a list of scalars and a list of numpy arrays treated differently (Case 1 vs Case 2)? In Case 2, I was hoping to see indexing happen only along axis 0 and get

array([[[1, 2],
        [3, 4]],

       [[1, 2],
        [3, 4]]])

Now, when using numpy array of arrays instead

>>> j1 = np.array(j) # numpy array of arrays

The result below indicates that indexing happened only along axis 0 (as expected)

>>> a[j1] # Case 3, j1 is a numpy array of numpy arrays
array([[[1, 2],
        [3, 4]],

       [[1, 2],
        [3, 4]]])

Q2: When used for indexing, why is there a difference in treatment of list of numpy arrays and numpy array of numpy arrays? (Case 2 vs Case 3)

  • For Q1, isn't the non-Numpy analog to use k = [[0,1], [0,1]]; a[k]? In that case, you see the same behavior between a list of lists and a list of Numpy arrays. Commented Dec 15, 2017 at 6:38
  • Another way to look at it is that [a[jj] for jj in j] gives you what you're expecting in Q1. Inputting a list of lists, whether a Numpy array or not, is basically returning one set of indices at a time. Commented Dec 15, 2017 at 6:41
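The list comprehension suggested in the second comment can be sketched as follows. This is a minimal illustration that rebuilds `a` and `j` from the question; the name `stacked` is introduced here and is not from the original post.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
j = [np.array([0, 1]), np.array([0, 1])]

# Index axis 0 once per element of j, then stack the results.
stacked = np.array([a[jj] for jj in j])

print(stacked.shape)                            # (2, 2, 2)
print(np.array_equal(stacked, a[np.array(j)]))  # True
```

Each `a[jj]` only touches axis 0, so stacking the pieces gives the 3-D result hoped for in Q1.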

2 Answers

Case 1: a[l] is actually a[(l,)], which expands to a[(l, slice(None))]. That is, the first dimension is indexed with the list l, and a trailing : slice is added automatically. Indices are passed as a tuple to the array's __getitem__, so the extra () can be added without changing the meaning.

Case 2: a[j] is treated as a[array([0, 1]), array([0, 1])] or, equivalently, a[(array([0, 1]), array([0, 1]))]. In other words, as a tuple of indexing objects, one per dimension. It ends up returning a[0,0] and a[1,1].

Case 3: a[j1] is a[(j1, slice(None))], applying the j1 index to just the first dimension.
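The three expansions can be written out with explicit tuples. This is a sketch, not NumPy's internal logic: the names `case1`, `case2` and `case3` are mine, and the explicit tuples are used so the result does not depend on how a given NumPy release interprets a bare list (as far as I know, releases from 1.23 onward treat `a[j]` like `a[np.array(j)]` instead).

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
l = [0, 1]
j = [np.array([0, 1]), np.array([0, 1])]
j1 = np.array(j)

# Case 1: the list indexes axis 0; the trailing slice is implied.
case1 = a[(l, slice(None))]    # same as a[l, :]

# Case 2 as described above: one index array per axis,
# which picks a[0, 0] and a[1, 1].
case2 = a[tuple(j)]

# Case 3: a single 2-D index array applied to axis 0 only.
case3 = a[(j1, slice(None))]   # same as a[j1, :]

print(case1.shape, case2, case3.shape)   # (2, 2) [1 4] (2, 2, 2)
```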

Case 2 is a bit of an anomaly. Your intuition is valid, but for historical reasons, this list of arrays (or list of lists) is interpreted as a tuple of arrays.

This has been discussed in other SO questions, and I think it is documented, but offhand I can't find those references.

So it's safer to use either a tuple of indexing objects, or an array. Indexing with a list has a potential ambiguity.


numpy array indexing: list index and np.array index give different result

This SO question touches on the same issue, though the clearest statement of what is happening is buried in a code link in a comment by @user2357112.

Another way to force Case 3-style indexing is to make the second-dimension slice explicit: a[j,:]

In [166]: a[j]
Out[166]: array([1, 4])
In [167]: a[j,:]
Out[167]: 
array([[[1, 2],
        [3, 4]],

       [[1, 2],
        [3, 4]]])

(I often include the trailing : even if it isn't needed. It makes it clear to me, and readers, how many dimensions we are working with.)
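To compare the unambiguous spellings side by side, here is a small sketch. The variable names `per_axis`, `axis0_only` and `explicit_slice` are illustrative only.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
j = [np.array([0, 1]), np.array([0, 1])]

per_axis = a[tuple(j)]        # one index array per axis -> a[0, 0], a[1, 1]
axis0_only = a[np.array(j)]   # one 2-D index array on axis 0
explicit_slice = a[j, :]      # trailing slice also forces the axis-0 reading

print(per_axis)                                     # [1 4]
print(np.array_equal(axis0_only, explicit_slice))   # True
```

Writing either `tuple(j)` or `np.array(j)` states the intent directly, so the reader does not have to know how a bare list is interpreted.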


A1: The structure of l is not the same as j.

l is one-dimensional while j is two-dimensional. If you change l so that both have the same structure:

# l = [0, 1]                                 # just one dimension!
l = [[0, 1], [0, 1]]                         # two dimensions
j = [np.array([0,1]), np.array([0, 1])]      # two dimensions

They behave the same way.
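A quick check of that claim, as a sketch: the nested list is named `l2` here (a hypothetical name) to keep it apart from the original `l`, and the indices are passed as explicit tuples or arrays so the comparison does not depend on the NumPy version.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
l2 = [[0, 1], [0, 1]]                       # list of lists
j = [np.array([0, 1]), np.array([0, 1])]    # list of arrays

# Both convert to the same 2-D index array.
same_structure = np.array_equal(np.array(l2), np.array(j))

# And they give the same result under either interpretation.
same_per_axis = np.array_equal(a[tuple(l2)], a[tuple(j)])
same_axis0 = np.array_equal(a[np.array(l2)], a[np.array(j)])

print(same_structure, same_per_axis, same_axis0)   # True True True
```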

A2: Likewise, the index structures in Case 2 and Case 3 are not the same: j is a list of arrays, while j1 is a single two-dimensional array.
