Let's create a large NumPy array 'a' with 10,000 entries:

import numpy as np
a = np.arange(0, 10000)

Now index the array with 'n' index arrays covering 0->9, 1->10, 2->11, etc.:

n = 32
b = list(map(lambda x:np.arange(x, x+10), np.arange(0, n)))
c = a[b]

The weird thing is that if n is smaller than 32, I get the error "IndexError: too many indices for array". If n is greater than or equal to 32, the code works perfectly. The error occurs regardless of the size of the initial array or of the individual slices, but the threshold is always 32. Note that if n == 1, the code works.

Any idea on what is causing this? Thank you.

2
  • What are you trying to do with your map? It will give [x, x+10) for x in [0, n), i.e. [0,1,2,3,4,5,6,7,8,9], then [1,2,3,4,5,6,7,8,9,10], then [2,3,4,5,6,7,8,9,10,11] ... which probably isn't what you meant. Commented Feb 28, 2019 at 17:00
  • Hi doctorlove, it really does not matter what I am trying to do with the map. I have changed the description of the code above. The real issue is with the error I get when n < 32. Commented Feb 28, 2019 at 17:10

2 Answers 2

2

Your b is a list of arrays:

In [84]: b = list(map(lambda x:np.arange(x, x+10), np.arange(0, 5)))            
In [85]: b                                                                      
Out[85]: 
[array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),
 array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10]),
 array([ 2,  3,  4,  5,  6,  7,  8,  9, 10, 11]),
 array([ 3,  4,  5,  6,  7,  8,  9, 10, 11, 12]),
 array([ 4,  5,  6,  7,  8,  9, 10, 11, 12, 13])]

When used as an index:

In [86]: np.arange(1000)[b]                                                     
/usr/local/bin/ipython3:1: FutureWarning: Using a non-tuple sequence for multidimensional 
indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. 
In the future this will be interpreted as an array index, `arr[np.array(seq)]`, 
which will result either in an error or a different result.
  #!/usr/bin/python3
---------------------------------------------------------------
IndexError: too many indices for array

A[1,2,3] is the same as A[(1,2,3)] - that is, the comma separated indices are a tuple, which is then passed on to the indexing function. Or to put it another way, a multidimensional index should be a tuple (that includes ones with slices).
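As a quick check of that equivalence, here is a minimal sketch (the array A and its shape are made up for illustration, not taken from the question):

```python
import numpy as np

# Illustrative 3-D array; the shape is an arbitrary choice.
A = np.arange(24).reshape(2, 3, 4)

# Comma-separated indices are packed into a tuple before indexing,
# so all three spellings select the same element.
x = A[1, 2, 3]
y = A[(1, 2, 3)]
idx = (1, 2, 3)
z = A[idx]

# A tuple that mixes a slice with integers behaves the same way.
row = A[(1, slice(None), 0)]   # equivalent to A[1, :, 0]
print(x, y, z, row)
```
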

Up to now numpy has been a bit sloppy, and allowed us to use a list of indices in the same way. The warning tells us that the developers are in the process of tightening up those restrictions.

The error means it is trying to interpret each array in your list as the index for a separate dimension. In NumPy 1.x, an array can have at most 32 dimensions. Evidently, for a list of 32 or more arrays it doesn't try to treat the list as a tuple, and instead creates a 2d array for indexing.
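One way to avoid depending on how the list is interpreted is to convert it to an index array explicitly. The sketch below assumes a helper named window_indices (hypothetical, written here only to rebuild the question's b) and checks that the result has the same form on both sides of the 32 threshold:

```python
import numpy as np

a = np.arange(0, 10000)

def window_indices(n, width=10):
    # Hypothetical helper: rebuilds the question's list of index arrays.
    return [np.arange(x, x + width) for x in range(n)]

shapes = {}
for n in (1, 5, 31, 32, 40):
    b = window_indices(n)
    c = a[np.array(b)]      # explicit 2-D integer index array
    shapes[n] = c.shape     # one row per window, for every n

print(shapes)
```
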

There are various ways we can use your b to index a 1d array:

In [87]: np.arange(1000)[np.hstack(b)]                                          
Out[87]: 
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9,  1,  2,  3,  4,  5,  6,  7,
        8,  9, 10,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11,  3,  4,  5,  6,
        7,  8,  9, 10, 11, 12,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13])

In [89]: np.arange(1000)[np.array(b)]    # or np.vstack(b)                                       
Out[89]: 
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10],
       [ 2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
       [ 3,  4,  5,  6,  7,  8,  9, 10, 11, 12],
       [ 4,  5,  6,  7,  8,  9, 10, 11, 12, 13]])

In [90]: np.arange(1000)[b,]             # 1d tuple containing b                                       
Out[90]: 
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10],
       [ 2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
       [ 3,  4,  5,  6,  7,  8,  9, 10, 11, 12],
       [ 4,  5,  6,  7,  8,  9, 10, 11, 12, 13]])

Note that if b is a ragged list (one or more of the arrays is shorter than the others), only the hstack version works.
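A minimal sketch of the ragged case, with made-up window lengths. Splitting the flat result back into per-window pieces with np.split is an addition for illustration, not part of the approach above:

```python
import numpy as np

a = np.arange(1000)

# Illustrative ragged list: the middle index array is shorter (lengths 10, 7, 10).
ragged = [np.arange(0, 10), np.arange(1, 8), np.arange(2, 12)]

# hstack concatenates the 1-D index arrays into one flat index.
flat_index = np.hstack(ragged)
picked = a[flat_index]

# Optional: recover the per-window pieces by splitting at the cumulative lengths.
lengths = [len(r) for r in ragged]
pieces = np.split(picked, np.cumsum(lengths)[:-1])
print(picked.shape, [len(p) for p in pieces])
```
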

1 Comment

Thanks. I was also playing around and np.array(b) does make it work.
2

First of all, you're not slicing 0->9, 10->19, 20->29; your slices advance by 1 only: 0->9, 1->10, 2->11. Instead, try this:

n = 32
size = 10
b = list(map(lambda x:np.arange(x, x+size), np.arange(0, n*size, size)))
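
As an aside, the same non-overlapping index can probably be built without map by broadcasting a column of start offsets against a row of within-block offsets. This is a swapped-in technique, not the code above; the names starts and b_grid are made up for the sketch. It produces a 2-D index array directly:

```python
import numpy as np

n = 32
size = 10

# The map-based construction, for comparison.
b_map = list(map(lambda x: np.arange(x, x + size), np.arange(0, n * size, size)))

# Broadcasting alternative: (n, 1) start offsets + (size,) offsets -> (n, size).
starts = np.arange(0, n * size, size)
b_grid = starts[:, None] + np.arange(size)

a = np.arange(0, 10000)
c = a[b_grid]               # 2-D integer index array, so the result is (n, size)
print(c.shape)
```
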

Next, you've misused the indexing notation. b is a list of arrays, and you've used this entire list to index a. When you have indexed more elements than exist in a, numpy assumes that you want the complex list taken as a sequence of references, and uses them as individual index arrays, one a element per leaf element in b.

However, once you drop below that limit, numpy assumes that you're trying to give a multi-dimensional index into a: each element of b is taken as an index into the corresponding dimension of a. Since a is only 1-dimensional, you get the error message. Your code will run in this mode with n=1, but fails with n=2 and above.
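
The "one index array per dimension" reading can be sketched with explicit tuples. The arrays rows, cols, a1 and a2 are illustrative and not from the question:

```python
import numpy as np

rows = np.array([0, 1, 2])
cols = np.array([3, 2, 1])

a1 = np.arange(10)                  # 1-D array
a2 = np.arange(20).reshape(4, 5)    # 2-D array

one = a1[(rows,)]                   # one index array for one dimension: fine
pairs = a2[(rows, cols)]            # picks a2[0, 3], a2[1, 2], a2[2, 1]

try:
    a1[(rows, cols)]                # two index arrays, but a1 has one dimension
    error = None
except IndexError as exc:
    error = str(exc)

print(one, pairs, error)
```
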

Although your question isn't a duplicate, also please see this one.

6 Comments

Hi Prune thanks for your answer. You are right. The slicing 0->9, 10-19 was just something I chose randomly. The real problem is the weird error I get when n < 32.
Do you understand the error message? I don't: with a smaller n there are fewer indices in b. What does the error mean?
The code fails with n in the range 2-31 because b is then small enough to be interpreted as a multi-dimensional slice; that interpretation takes precedence. See the linked question for details. When b is 32 or larger, the only legal interpretation is as a sequence of individual requests.
Got your point. But then why does it succeed with n = 32 and above?
See the comment just above yours.