I want to create a NumPy array b in which each element is a 2D matrix whose dimensions are determined by consecutive entries of the vector a.

The following gives me what I want:

>>> a = [3,4,1]
>>> b = [np.zeros((a[i], a[i - 1] + 1)) for i in range(1, len(a))]
>>> np.array(b)
array([ array([[ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.]]),
       array([[ 0.,  0.,  0.,  0.,  0.]])], dtype=object)

but I have found this pathological case where it does not work:

>>> a = [2,1,1]
>>> b = [np.zeros((a[i], a[i - 1] + 1)) for i in range(1, len(a))]
>>> b
[array([[ 0.,  0.,  0.]]), array([[ 0.,  0.]])]
>>> np.array(b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: could not broadcast input array from shape (3) into shape (1)
  • dtype=object tells you all you need to know about the working case: this is pretty much a list wrapped up in numpy, and I can't think of any advantage to using numpy in this application. Commented Jan 29, 2019 at 21:10
  • In other words, the shape of this array will give you non-contiguous memory and prevent you from performing vectorized calculations etc. Commented Jan 29, 2019 at 21:13
  • Doesn't even work with dtype=object damn... Commented Jan 29, 2019 at 21:15
  • @Felix Why would it? The data type will not affect the ability to reshape Commented Jan 29, 2019 at 21:17
  • @roganjosh To my knowledge, having data type object allows for arbitrary objects to be stored in the array. Much like a list. So storing arbitrary NumPy arrays shouldn't be a problem. Curiously, taking one set of parentheses off either array does not result in the error. Commented Jan 29, 2019 at 21:18

2 Answers

I will present a solution to the problem, but do take into account what was said in the comments. Having Numpy arrays that are not aligned prevents most of the useful operations from working their magic. Consider using lists instead.

That being said, curious error indeed. I got the thing to work by assigning in a basic for-loop instead of using the np.array call.

import numpy as np

a = [2,1,1]
# Preallocate a 1-D object array, then fill it one element at a time
b = np.zeros(len(a)-1, dtype=object)
for i in range(1, len(a)):
    b[i-1] = np.zeros((a[i], a[i - 1] + 1))

And the result:

>>> b
array([array([[0., 0., 0.]]), array([[0., 0.]])], dtype=object)
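
The loop above can be wrapped in a small helper so the same pattern works for any list of arrays, whether or not their shapes happen to match. This is only a sketch: the name make_object_array is hypothetical and not part of NumPy, and np.empty is used in place of np.zeros simply because every slot is overwritten anyway.

```python
import numpy as np

def make_object_array(arrays):
    """Hypothetical helper: pack arbitrary arrays into a 1-D object array."""
    out = np.empty(len(arrays), dtype=object)  # one slot per input array
    for i, arr in enumerate(arrays):
        out[i] = arr  # assign to a single element, never to a slice
    return out

a = [2, 1, 1]
b = make_object_array([np.zeros((a[i], a[i - 1] + 1)) for i in range(1, len(a))])
print(b.shape, [m.shape for m in b])  # (2,) [(1, 3), (1, 2)]
```

Because the target shape is fixed before anything is assigned, np.array never gets a chance to guess a shape from the inputs.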

5 Comments

If I initialize b = np.zeros((2,1), object), then b[0,:]=... produces the original (3) into (1) error. So it appears to be doing the same fill loop, but starting with the wrong target shape.
You'd have to use this sort of loop to make an object array from arrays of equal size. In a sense your first example that works is the anomaly, not the norm :)
@hpaulj I see in the original question that the outermost list (and then array) is flat. Perhaps I just didn't get your point.
When given a list of arrays, np.array(...) can do 3 things: make a multidimensional array with a numeric dtype, make a 1d object dtype array containing these arrays, or raise an error (such as this broadcasting one). The first is the documented norm. Which of the other 2 is correct (or documented)? np.array is not a good general purpose tool for creating object dtype arrays with a desired shape and content.
@hpaulj I would hope that allowing an object array is the correct one. I've not read the documentation. And when using b[i-1, 0] instead of the colon the error is not thrown. The same works for an outer array of size (3, 2) and I imagine for any shape of the outermost array. Just assign to one specific element, not a slice.

This is a bit peculiar. Typically, numpy will try to create a single array with a common data type from the input of np.array. A list of arrays is interpreted with the list as the new leading dimension. For instance, np.array([np.zeros((3, 1)), np.zeros((3, 1))]) would produce a 2 x 3 x 1 array. This can only happen if the arrays in your list match in shape. Otherwise, you end up with an array of arrays (with dtype=object), which, as commented, is not really an ideal scenario.
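
As a quick check of the stacking behaviour described above, here is a minimal sketch with two equally shaped arrays (the variable name stacked is only illustrative):

```python
import numpy as np

# Two arrays of identical shape (3, 1): the list becomes a new leading axis
stacked = np.array([np.zeros((3, 1)), np.zeros((3, 1))])
print(stacked.shape, stacked.dtype)  # (2, 3, 1) float64
```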

However, your error seems to occur when the first dimension of the arrays matches. For some reason numpy then tries to broadcast the arrays into a common shape and fails. I can reproduce your error even if the arrays are of higher dimension, as long as their first dimension matches.

I know this isn't a solution, but this wouldn't fit in a comment. As noted by @roganjosh, making this kind of array really gives you no benefit. You're better off sticking to a list of arrays for readability and to avoid the cost of creating these arrays.
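
For what it's worth, here is a minimal sketch of the plain-list approach, reusing the comprehension from the question. The names matrices, shapes, total and shifted are only illustrative.

```python
import numpy as np

a = [2, 1, 1]
# Keep the differently shaped matrices in an ordinary Python list
matrices = [np.zeros((a[i], a[i - 1] + 1)) for i in range(1, len(a))]

shapes = [m.shape for m in matrices]   # inspect each matrix
total = sum(m.size for m in matrices)  # total number of entries
shifted = [m + 1.0 for m in matrices]  # each matrix is still vectorized on its own

print(shapes, total)  # [(1, 3), (1, 2)] 5
```

Each element remains an ordinary contiguous float array, so vectorized operations still apply within a matrix; only the outer container is a Python list.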

1 Comment

Nice! It would be interesting to know why, but that may involve extensive digging.
