Efficient conversion of a 3D numpy array to a 1D numpy array

Question

I have a 3D numpy array in this form:

>>> img.shape
(4504932, 2, 2)

>>> img
array([[[15114, 15306],
    [15305, 15304]],

   [[15305, 15306],
    [15303, 15304]],

   [[15305, 15306],
    [15303, 15304]],

   ..., 

   [[15305, 15302],
    [15305, 15302]]], dtype=uint16)

I want to convert this to a 1D numpy array in which each entry is the sum of the corresponding 2x2 submatrix of the img array above.

I have been able to accomplish this using:

>>> img_new = np.array([i.sum() for i in img])
>>> img_new
array([61029, 61218, 61218, ..., 61214, 61214, 61214], dtype=uint64)

This is exactly what I want, but it is too slow (it takes about 10 seconds). Is there a faster method I could use? I included img.shape above to give an idea of the size of this numpy array.

EDIT - ADDITIONAL INFO: My img array could also hold submatrices of other sizes, such as 4x4, 5x5, or 7x7. The submatrix size is specified by the variables sub_rows and sub_cols.

user2357112 · Accepted Answer · 2015-06-16 22:54:19Z

4

img.sum(axis=(1, 2))

sum allows you to specify an axis or axes along which to sum, rather than just summing the whole array. This allows NumPy to loop over the array in C and perform just a few machine instructions per sum, rather than having to go through the Python bytecode evaluation loop and create a ton of wrapper objects to stick in a list.
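As a minimal sketch of the point above, the following compares the per-block Python loop from the question with a single sum call over the last two axes. The small img array is made up for illustration and is not the asker's data.

```python
import numpy as np

# Hypothetical stand-in for the asker's data: 3 blocks of shape 2x2, dtype uint16.
img = np.array(
    [[[1, 2], [3, 4]],
     [[5, 6], [7, 8]],
     [[9, 10], [11, 12]]],
    dtype=np.uint16,
)

# Original approach: one Python-level sum() call per 2x2 block.
loop_result = np.array([block.sum() for block in img])

# Vectorized approach: reduce over axes 1 and 2 in a single call.
vec_result = img.sum(axis=(1, 2))

print(vec_result.shape)  # (3,)
print(vec_result)        # [10 26 42]
```

Both approaches give the same values; only the second keeps the loop inside NumPy.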

edited Jun 16, 2015 at 22:54

answered Jun 16, 2015 at 22:49


7 Comments

bazzoli92 Over a year ago

That was super fast. Thanks! I am now adding an additional piece of info to my question. Could you comment to let me know whether this would still apply?

user2357112 Over a year ago

@user3250673: img.sum(axis=(1, 2)) still works. It has no dependence on the size of any axes.
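To illustrate that the call does not depend on the block size, here is a small sketch using the sub_rows and sub_cols names from the question. The values chosen, the block count n_blocks, and the random data are assumptions for demonstration only.

```python
import numpy as np

sub_rows, sub_cols = 4, 4  # names from the question; values are arbitrary here
n_blocks = 5               # hypothetical number of blocks

rng = np.random.default_rng(0)
img = rng.integers(0, 1000, size=(n_blocks, sub_rows, sub_cols), dtype=np.uint16)

# Same call as for 2x2 blocks: reduce over the two trailing axes.
block_sums = img.sum(axis=(1, 2))

# Negative axis indices refer to the same two axes of a 3D array.
same_sums = img.sum(axis=(-2, -1))

print(block_sums.shape)  # (5,)
```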

bazzoli92 Over a year ago

Thank you all for the help.

bazzoli92 Over a year ago

Just realized that I have a small issue. My img array is of dtype=uint16, so I cannot directly use img.sum(axis=(1,2)). Would using astype set to int32 be a bad idea?

user2357112 Over a year ago

@user3250673: There's nothing stopping you from using sum on an array of dtype uint16, but if you're worried about it overflowing, sum has a dtype parameter. This sets the dtype of the accumulator used and of the resulting array. However, note that when the dtype of an array has size less than the default platform integer, NumPy will already autopromote the sum to the default platform integer size, so you may already be getting a uint32 or uint64 instead of a uint16.
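A short sketch of the dtype parameter described above, using a worst-case uint16 array (every entry at the maximum value) that is made up for illustration. The default result dtype is platform-dependent, so no expected output is given for it.

```python
import numpy as np

# Worst case for uint16: every entry is the maximum value, 65535.
img = np.full((3, 2, 2), np.iinfo(np.uint16).max, dtype=np.uint16)

# Default behaviour: NumPy promotes the accumulator to the platform unsigned integer.
default_sums = img.sum(axis=(1, 2))
print(default_sums.dtype)   # platform-dependent (uint32 or uint64)

# Explicit accumulator and result dtype, independent of the platform.
explicit_sums = img.sum(axis=(1, 2), dtype=np.uint64)
print(explicit_sums.dtype)  # uint64
print(explicit_sums[0])     # 262140, i.e. 4 * 65535
```

Passing dtype avoids the extra full-size copy that img.astype(...) would create before summing.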


cr1msonB1ade · Answer · 2015-06-16 22:55:22Z

0

Using a numpy method (apply_over_axes) is usually quicker and indeed that is the case here. I just tested on a 4000x2x2 array:

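import numpy as np

img = np.random.rand(4000, 2, 2)
# apply_over_axes(func, a, axes) sums over axes 1 and 2; the result has shape (4000, 1, 1)
%timeit np.apply_over_axes(np.sum, img, [1, 2])
# 1000 loops, best of 3: 721 us per loop
%timeit np.array([i.sum() for i in img])
# 100 loops, best of 3: 17.2 ms per loop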
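One detail worth noting about np.apply_over_axes: it keeps the reduced axes as length-1 dimensions, so the result is not yet the 1D array the question asks for. The sketch below, on random data of the same shape as the test above, shows one way to flatten it.

```python
import numpy as np

img = np.random.rand(4000, 2, 2)

# The summed axes are kept with length 1.
kept = np.apply_over_axes(np.sum, img, [1, 2])
print(kept.shape)  # (4000, 1, 1)

# Flatten to the 1D form requested in the question.
flat = kept.reshape(-1)
print(flat.shape)  # (4000,)

# Same values as the direct reduction, up to floating-point rounding.
print(np.allclose(flat, img.sum(axis=(1, 2))))  # True
```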

answered Jun 16, 2015 at 22:55


2 Comments

bazzoli92 Over a year ago

Quicker than using img.sum(axis=(..,..))?

cr1msonB1ade Over a year ago

Nope:( And slower to the answer too. timeit(img.sum(axis=(1,2))) # 10000 loops, best of 3: 51.8 us per loop

Divakar · Answer · 2015-06-17 05:02:05Z

0

You can use np.einsum -

img_new = np.einsum('ijk->i',img)

Verify results

In [42]: np.array_equal(np.array([i.sum() for i in img]),np.einsum('ijk->i',img))
Out[42]: True

Runtime tests

In [34]: img = np.random.randint(0,10000,(10000,2,2)).astype('uint16')

In [35]: %timeit np.array([i.sum() for i in img]) # Original approach
10 loops, best of 3: 92.4 ms per loop

In [36]: %timeit img.sum(axis=(1, 2)) # From other solution
1000 loops, best of 3: 297 µs per loop

In [37]: %timeit np.einsum('ijk->i',img)
10000 loops, best of 3: 102 µs per loop
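A caveat when using np.einsum on uint16 data: to my understanding, einsum does not promote small integer types the way sum does, so the result keeps the input dtype unless one is requested. The sketch below uses a made-up worst-case array and einsum's dtype parameter; treat it as an illustration under that assumption rather than a statement about the timings above.

```python
import numpy as np

# Worst case for uint16: a 2x2 block sum (4 * 65535 = 262140) cannot fit in 16 bits.
img = np.full((3, 2, 2), np.iinfo(np.uint16).max, dtype=np.uint16)

# Without dtype, the result dtype is expected to follow the input (assumption noted above).
narrow = np.einsum('ijk->i', img)
print(narrow.dtype)

# Requesting a wider dtype; uint16 -> uint64 is allowed under the default 'safe' casting.
wide = np.einsum('ijk->i', img, dtype=np.uint64)
print(wide.dtype)  # uint64
print(wide[0])     # 262140
```

For the value ranges in the question and in the runtime test above, 2x2 sums still fit in uint16, but larger submatrices such as 4x4 or 7x7 may not.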

answered Jun 17, 2015 at 5:02

