I have a numpy.array of numpy arrays of different shapes. When I call np.sum(my_array) I get this error:

Traceback (most recent call last):
return umr_sum(a, axis, dtype, out, keepdims)
ValueError: operands could not be broadcast together with shapes (13,5) (5,3)

All I want is the sum of all values across all arrays, i.e. sum(my_array) should be a single float.

Is there a parameter that I missed, or another method? I can only think of something like this:

np.sum([np.sum(a) for a in my_array])

Is this an optimal way?

Update:

print(type(my_array))
print((my_array).shape)
print([(type(sub_array), sub_array.shape) for sub_array in my_array])

output:

<class 'numpy.ndarray'>
(2,)
[(<class 'numpy.ndarray'>, (13, 5)), (<class 'numpy.ndarray'>, (5, 3))]
  • What is my_array? Is it a python list containing numpy arrays? Commented Nov 6, 2014 at 21:30
  • @jozzas It's also a numpy.array Commented Nov 6, 2014 at 21:32
  • What kind of ndarray is this that contains other ndarrays of varying shapes? Please print the results of type(my_array) for us. Commented Nov 6, 2014 at 21:35
  • @ballsdotballs I've updated the post above with an output. Commented Nov 6, 2014 at 21:44
  • TIL that you can have ndarrays with dtype 'object' whose elements can be arbitrary shapes and types. Commented Nov 6, 2014 at 21:57

2 Answers

Using a generator should be better in most cases:

np.sum(np.sum(a) for a in my_array)

Without the '[ ... ]' you pass a generator expression, so no intermediate list is created.

%timeit np.sum( np.sum(a) for a in my_array )

100000 loops, best of 3: 5.73 µs per loop

%timeit np.sum( [np.sum(a) for a in my_array] )

100000 loops, best of 3: 9.97 µs per loop
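As a minimal sketch of the same idea outside IPython: newer NumPy releases emit a DeprecationWarning when np.sum is handed a generator, so the variant below uses Python's built-in sum instead. The contents of my_array here are a hypothetical stand-in (arrays of ones with the shapes from the question), not the asker's real data.

```python
import numpy as np

# Hypothetical stand-in for my_array: an object-dtype array holding two
# sub-arrays with the shapes reported in the question.
my_array = np.empty(2, dtype=object)
my_array[0] = np.ones((13, 5))
my_array[1] = np.ones((5, 3))

# a.sum() reduces each sub-array to a scalar; the built-in sum then adds
# those scalars while consuming the generator lazily (no list is built).
total = sum(a.sum() for a in my_array)
print(total)
```

Whether this is faster than the list version depends on the number and size of the sub-arrays, so it is worth timing on your own data.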

2 Comments

Wow. Didn't know that I can omit '[..]' thanks. How do you make this %timeit? I've tried it but couldn't make it work like yours. Please, can you tell or post a link?
I wrote this in an IPython Notebook. You should take a look at that; it is included when installing Python with Anaconda or Enthought. %timeit can be used there.
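If IPython is not available, a rough equivalent of %timeit can be sketched with the standard-library timeit module. The array below is a hypothetical stand-in with the shapes from the question, and the loop count of 10000 is an arbitrary choice.

```python
import timeit

import numpy as np

# Hypothetical stand-in for my_array, using the shapes from the question.
my_array = np.empty(2, dtype=object)
my_array[0] = np.ones((13, 5))
my_array[1] = np.ones((5, 3))

loops = 10000  # arbitrary number of repetitions
# timeit.timeit returns the total time in seconds for all repetitions.
elapsed = timeit.timeit(lambda: sum(a.sum() for a in my_array), number=loops)
print(f"{elapsed / loops * 1e6:.2f} µs per loop")
```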
a = np.array([np.arange(n) for n in range(16, 32)], dtype=object)

In [28]: %timeit np.sum(list(map(np.sum, a)))
10000 loops, best of 3: 90.5 µs per loop

In [29]: %timeit np.sum(np.sum(b) for b in a)
10000 loops, best of 3: 86 µs per loop

In [30]: %timeit np.sum([np.sum(b) for b in a])
10000 loops, best of 3: 90.2 µs per loop

Also note that it is often easier to work with zero-padded NumPy arrays if you know the maximum size in advance.
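A minimal sketch of the zero-padding idea, assuming 1-D sub-arrays like the ones built with np.arange above (the names arrays and padded are illustrative only). Padding with zeros leaves the total unchanged, and the result is a regular 2-D array that np.sum can reduce in a single call.

```python
import numpy as np

# Ragged 1-D arrays of lengths 16..31, mirroring the example above.
arrays = [np.arange(n, dtype=float) for n in range(16, 32)]

# Allocate one rectangular array wide enough for the longest sub-array.
max_len = max(len(a) for a in arrays)
padded = np.zeros((len(arrays), max_len))

# Copy each sub-array into the start of its row; the remainder stays zero.
for i, a in enumerate(arrays):
    padded[i, :len(a)] = a

total = padded.sum()
print(padded.shape, total)
```

The trade-off is extra memory for the padding, which only pays off when the sub-arrays have similar sizes.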
