Today I wanted to subset a high-dimensional NumPy array with the following commands, but was surprised to find that the two methods give different results. I'm very curious why NumPy reorders the array in the first method.

>>> x = np.arange(0,36).reshape(3, 3, 4)
>>> x
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]],

       [[24, 25, 26, 27],
        [28, 29, 30, 31],
        [32, 33, 34, 35]]])
>>> x[0,0:2,[1,3]]
array([[1, 5],
       [3, 7]])
>>> x[0][0:2,[1,3]]
array([[1, 3],
       [5, 7]])
  • Welcome to SO! Please put your code into blocks surrounded by ``` so it can be copied/searched/discovered, etc.! meta.stackoverflow.com/questions/285551/… Commented Oct 19 at 5:22
  • Yes, your question needs proper formatting. But to save other viewers the effort of viewing your images, I'll note that x[0,0:2,[1,3]] is a well-documented and much-discussed case of mixed basic and advanced indexing. A slice in the middle often transposes dimensions. Commented Oct 19 at 20:36
  • @ti7 "Don't: ... Transcribe code from an image to text" Commented Oct 28 at 22:27
  • @KellyBundy I absolutely agree in general, and Askers should be helped to make good Questions! Still, I think it's practical to do it extremely rarely. Neither has the Asker fixed it, nor has the Question been closed after many days of sitting, which indicates to me that it's only the structure that's bad (really, shame on the 5 downvoters and me for not also using the close system; perhaps this should have gone away for details/clarity or as a duplicate?). I think the Question is genuinely a little interesting, if contrived, and hpaulj gave most of the Answer already, so it should be made useful and saved. Commented Oct 29 at 2:57

1 Answer

What's happening might be clearer with an even smaller or less-contrived example; as @hpaulj notes, you're running afoul of "Combining advanced and basic indexing".

This wasn't obvious to me when I started reading into this, but there is actually quite a variety of indexing styles, depending on how Basic and Advanced indexing are used together:

https://numpy.org/doc/stable/user/basics.indexing.html

| Category | General identification [1] |
| --- | --- |
| Basic | slice (+ stride), `...` (a wildcard for however many empty slices `:` are needed), or single elements without a slice; returns a view |
| Advanced | "triggered when the selection object, obj, is a non-tuple sequence object, an ndarray [..], or a tuple with at least one sequence object or ndarray [..]" [2]; returns a copy of the data |
| Basic + Advanced | slice before advanced indexing |
| Advanced + Basic | advanced indexing before slice |
| Combined | advanced indexing sandwiching slicing (your case). "When there is at least one slice (:), ellipsis (...) or newaxis in the index (or the array has more dimensions than there are advanced indices), then the behaviour can be more complicated. It is like concatenating the indexing result for each advanced index element." |

1. The table does not attempt to give a 100% complete identification, only to help understand and summarize the indexing styles described in the docs and below.
2. The quote omits the supported-types note "(of data type integer or bool)".
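To make the view/copy distinction in the table concrete, here is a minimal sketch. It assumes only that `np.shares_memory` is an adequate check for "is this a view"; the variable names are purely illustrative.

```python
import numpy as np

x = np.arange(12).reshape(4, 3)

basic = x[1:3, 1:]        # slices only: basic indexing
advanced = x[[1, 2], 1:]  # an integer list is involved: advanced indexing

# A view shares its buffer with x, a copy does not
basic_shares = np.shares_memory(x, basic)
advanced_shares = np.shares_memory(x, advanced)
print(basic_shares, advanced_shares)

# Writing through the view changes x; writing to the copy does not
basic[0, 0] = -1
advanced[0, 0] = 99
print(x[1, 1])
```

The write to `basic` lands in `x[1, 1]`, while the write to `advanced` stays local to the copy.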

When you index in two steps, arr[0] runs first. It is basic indexing, so it returns a view of the first plane, and the second step, [0:2, [1, 3]], then contains only one advanced index, so the dimension it selects stays where it was. When you write arr[0, 0:2, [1, 3]] in one step, the integer 0 and the list [1, 3] both count as advanced indices and are separated by a slice, so the dimension they produce is moved to the front, which effectively transposes the result.
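The difference can be checked directly on the array from the question. This is only a small sketch; the second pair of lines uses a full slice (my own variation, not from the question) so that the two shapes differ visibly.

```python
import numpy as np

x = np.arange(36).reshape(3, 3, 4)

one_step = x[0, 0:2, [1, 3]]  # 0 and [1, 3] are separated by the slice
two_step = x[0][0:2, [1, 3]]  # x[0] first, then slice + one advanced index

# Same values, but one result is the transpose of the other
same_up_to_transpose = np.array_equal(one_step, two_step.T)
print(same_up_to_transpose)

# With a non-square selection the moved dimension shows up in the shape
wide_one_step = x[0, :, [1, 3]]
wide_two_step = x[0][:, [1, 3]]
print(wide_one_step.shape, wide_two_step.shape)
```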

Practically, I suspect the general wisdom is "don't do that" unless you have to: perform the operations in multiple steps instead, to avoid bizarre-seeming transforms. If you then need to improve the speed, document your research and carefully go cut those corners (Numba?)!
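As one possible reading of "multiple steps", the selection from the question could be split up as below. The intermediate names are hypothetical, and `np.take` is shown only as an alternative I find readable, not as a recommendation from the docs.

```python
import numpy as np

x = np.arange(36).reshape(3, 3, 4)

plane = x[0]               # basic: first 2-D plane (a view)
rows = plane[0:2]          # basic: rows 0 and 1 (still a view)
picked = rows[:, [1, 3]]   # advanced: columns 1 and 3 (a copy)
print(picked)

# Alternative: select along an explicit axis so the axis order is never in doubt
picked_take = np.take(x[0, 0:2], [1, 3], axis=1)
print(np.array_equal(picked, picked_take))
```

Both forms keep the row dimension first, matching `x[0][0:2, [1, 3]]` from the question.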

Using the example from the NumPy docs, you can see how these different indexing strategies appear to give the same result but are actually backed a little differently. (Perhaps counterintuitively, "slicing" falls under the umbrella of "basic indexing", while "integer array indexing" lives under "advanced indexing".)

>>> x = np.array([[ 0,  1,  2],
...               [ 3,  4,  5],
...               [ 6,  7,  8],
...               [ 9, 10, 11]])

Basic Indexing only

>>> x[1:2, 1:3]          # [x[1,1], x[1,2]]   basic indexing (slice)
array([[4, 5]])
>>> x[1:2, 1:3].base     # view based on entire original array
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])

Basic Indexing + Advanced Indexing

>>> x[1:2, [1, 2]]       # basic + advanced
array([[4, 5]])
>>> x[1:2, [1, 2]].base  # transposed view of slice?
array([[4],
       [5]])
>>> x[1:2][...,[1,2]]    # clearer view with basic,advanced separated
array([[4, 5]])
>>> x[1:2][...,[1,2]].base
array([[4],
       [5]])

Advanced Indexing + Basic Indexing

>>> x[[1,2], 2:]
array([[5],
       [8]])
>>> x[[1,2], 2:].base is None  # new copy
True

Advanced Indexing only

>>> x[[0+1,1],[0+1,2]]                   # first index offset from slice
array([4, 5])
>>> np.array([x[1:2][0,1],x[1:2][0,2]])  # slice broken out
array([4, 5])

Expanding the basic + advanced indexing example to slice more of the first dimension makes it clearer that the resulting array is really just the Cartesian product of the relevant indices, whereas mixing advanced and basic indexing in other ways can transpose the result

>>> x[1:, [1,2]]
array([[ 4,  5],
       [ 7,  8],
       [10, 11]])
>>> from itertools import product
>>> slice1_of_index0 = range(1,x.shape[0])  # implement `1:`
>>> list(product(slice1_of_index0, [1,2]))
[(1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)]
>>> np.array([[x[1,1],x[1,2]],[x[2,1],x[2,2]],[x[3,1],x[3,2]]])
array([[ 4,  5],
       [ 7,  8],
       [10, 11]])
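The hand-written comparison above can be generalized a little. The sketch below rebuilds the same result from `itertools.product` and, as an aside that is not in the original example, from `np.ix_`, which builds an open mesh for exactly this kind of Cartesian-product selection.

```python
import numpy as np
from itertools import product

x = np.arange(12).reshape(4, 3)

rows = list(range(1, x.shape[0]))  # the indices covered by `1:`
cols = [1, 2]

direct = x[1:, [1, 2]]

# Element-by-element lookup over the Cartesian product, reshaped row-major
from_product = np.array([x[r, c] for r, c in product(rows, cols)])
from_product = from_product.reshape(len(rows), len(cols))

# Pure advanced indexing with an open mesh gives the same array
via_ix = x[np.ix_(rows, cols)]

print(np.array_equal(direct, from_product), np.array_equal(direct, via_ix))
```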

Now the Combined Indexing transpose you see can be triggered by adding another dimension, which makes it possible to sandwich the Basic Indexing between advanced indices!

>>> y = x.reshape(1,*x.shape)
>>> y
array([[[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8],
        [ 9, 10, 11]]])
>>> x.shape, y.shape
((4, 3), (1, 4, 3))
>>> np.array_equal(x, y)     # expected: the shapes differ
False
>>> np.array_equal(x, y[0])  # x really is the same as y[0]
True
>>> y[0][1:, [1,2]]    # just the same as x[1:, [1,2]]
array([[ 4,  5],
       [ 7,  8],
       [10, 11]])
>>> y[0, 1:, [1,2]]    # result becomes transposed due to combined indexing
array([[ 4,  7, 10],
       [ 5,  8, 11]])
>>> y[[0], 1:, [1,2]]  # same again: the integer 0 acts like the list [0] here
array([[ 4,  7, 10],
       [ 5,  8, 11]])
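Checking the shapes makes the move of the advanced dimension explicit. The last line is an assumption about what one might want in practice: replacing the integer 0 with the slice `0:1` leaves only one advanced index, so the original axis order is kept.

```python
import numpy as np

x = np.arange(12).reshape(4, 3)
y = x.reshape(1, *x.shape)            # shape (1, 4, 3)

sandwiched = y[0, 1:, [1, 2]]         # advanced, slice, advanced
two_step = y[0][1:, [1, 2]]           # basic first, then slice + advanced
print(sandwiched.shape, two_step.shape)

# The two results differ only by a transpose
print(np.array_equal(sandwiched.T, two_step))

# A slice in place of the integer avoids the sandwich entirely
sliced_first = y[0:1, 1:, [1, 2]][0]
print(np.array_equal(sliced_first, two_step))
```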

The behavior is actually explained by the NumPy indexing docs, but I feel it is difficult to understand without more context than is provided:

Two cases of index combination need to be distinguished:

  • The advanced indices are separated by a slice, Ellipsis or newaxis. For example x[arr1, :, arr2].
  • The advanced indices are all next to each other. For example x[..., arr1, arr2, :] but not x[arr1, :, 1] since 1 is an advanced index in this regard.

In the first case, the dimensions resulting from the advanced indexing operation come first in the result array, and the subspace dimensions after that. In the second case, the dimensions from the advanced indexing operations are inserted into the result array at the same spot as they were in the initial array (the latter logic is what makes simple advanced indexing behave just like slicing).

From https://numpy.org/doc/stable/user/basics.indexing.html#combining-advanced-and-basic-indexing (emphasis mine); this is also covered directly in "numpy indexing with conditional transposes dimensions".
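The two cases from the quoted passage can be seen from result shapes alone. This is a small sketch on an arbitrary (3, 4, 5) array; the index arrays `i` and `j` are made up for illustration.

```python
import numpy as np

a = np.zeros((3, 4, 5))
i = np.array([0, 1])
j = np.array([2, 3])

# Case 1: advanced indices separated by a slice, their dimension comes first
separated = a[i, :, j].shape

# Case 2: advanced indices adjacent, their dimension stays in place
adjacent_front = a[i, j, :].shape
adjacent_back = a[:, i, j].shape

# A bare integer counts as an advanced index here, so this is case 1 again
integer_and_array = a[0, :, j].shape

print(separated, adjacent_front, adjacent_back, integer_and_array)
```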

