vectorized sum of array according to indices of second array [duplicate]

Question

I have an empty array:

import numpy as np
empty = np.array([0, 0, 0, 0, 0])

an array of indices corresponding to positions in my array empty

ind = np.array([2, 3, 1, 2, 4, 2, 4, 2, 1, 1, 1, 2])

and an array of values

val = np.array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])

I want to add the values in 'val' into 'empty' at the positions given by 'ind'.

The non-vectorized solution is:

for i, v in zip(ind, val):
    empty[i] += v
>>> empty
array([0, 4, 5, 1, 2])

My actual arrays are multidimensional and very long, so speed matters. I really want a vectorized solution, or at least one that is very fast.

Note this does not work:

empty[ind] += val  # starting again from an all-zero array
>>> empty
array([0, 1, 1, 1, 1])

I'd be extra grateful for a solution that works in Python 2.7, 3.5 and 3.6 with no hiccups.

It's true that it is a duplicate, but my question title is much clearer.
– user6794223, Commented Feb 9, 2017 at 16:19

Nickil Maveli · Accepted Answer · 2017-02-09 11:47:21Z

6

You can use np.add.at, which is equivalent to empty[ind] += val except that results are accumulated for elements that are indexed more than once, so repeated indices receive the sum of all their values.

>>> np.add.at(empty, ind, val)
>>> empty
array([0, 4, 5, 1, 2])
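
To make the difference concrete, here is a minimal sketch comparing plain fancy-indexed += with np.add.at on the arrays from the question. The names buffered and unbuffered are illustrative and do not come from the original post.

```python
import numpy as np

ind = np.array([2, 3, 1, 2, 4, 2, 4, 2, 1, 1, 1, 2])
val = np.ones(ind.size, dtype=int)

# Buffered: the right-hand side is computed once and written back,
# so a repeated index keeps only a single increment.
buffered = np.zeros(5, dtype=int)
buffered[ind] += val

# Unbuffered: every (index, value) pair is applied, so repeats accumulate.
unbuffered = np.zeros(5, dtype=int)
np.add.at(unbuffered, ind, val)

print(buffered.tolist())    # [0, 1, 1, 1, 1]
print(unbuffered.tolist())  # [0, 4, 5, 1, 2]
```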

answered Feb 9, 2017 at 11:47

Nickil Maveli


Daniel F · Answer · 2017-02-09 12:36:44Z

2

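What you are looking for is e = np.bincount(ind, weights=val, minlength=n), where n is the length of your empty array. That way you don't have to initialize empty at all. You only need to create e this way the first time; afterward you can accumulate with e += np.bincount(ind, weights=val, minlength=n). Keep minlength in the later calls as well, since without it the result is only max(ind) + 1 long and may not match the length of e.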

On the small example arrays, this is about twice as fast as np.add.at:

%timeit np.bincount(ind, val, minlength=empty.size)
The slowest run took 12.69 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 2.05 µs per loop

%timeit np.add.at(empty, ind, val)
The slowest run took 2822.05 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 4.32 µs per loop
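
For reference, a minimal sketch of the bincount approach on the question's data, assuming non-negative integer indices. When weights is given, np.bincount returns a float64 array, which matters if the target array is an integer array. The second batch (batch_ind, batch_val) and the name acc are made up for illustration.

```python
import numpy as np

ind = np.array([2, 3, 1, 2, 4, 2, 4, 2, 1, 1, 1, 2])
val = np.ones(ind.size)
n = 5  # length of the target array

# First pass: bincount builds the accumulator directly.
acc = np.bincount(ind, weights=val, minlength=n)
print(acc.tolist())  # [0.0, 4.0, 5.0, 1.0, 2.0]

# Later passes: keep minlength so the shapes match even when the
# largest index does not occur in this batch.
batch_ind = np.array([0, 1, 1])
batch_val = np.array([10.0, 0.5, 0.5])
acc += np.bincount(batch_ind, weights=batch_val, minlength=n)
print(acc.tolist())  # [10.0, 5.0, 5.0, 1.0, 2.0]
```

The timings above are for 12 elements; relative speed on large arrays is worth measuring on your own data.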

As for multi-dimensional indices, you can do:

np.bincount(np.ravel_multi_index(ind, empty.shape), np.ravel(val), minlength=empty.size).reshape(empty.shape)

I'm not sure how to do this with np.add.at to compare speeds.
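
As a possible way to do the same thing with np.add.at: it accepts a tuple of index arrays when the target is multi-dimensional. The sketch below compares that with the bincount route on a small made-up 2-D case; target, rows and cols are hypothetical names and data, not taken from the question.

```python
import numpy as np

# Hypothetical 2-D target and one index array per axis.
target = np.zeros((3, 4))
rows = np.array([0, 0, 2, 2, 2])
cols = np.array([1, 1, 3, 0, 3])
val = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# np.add.at with a tuple of index arrays, one per dimension.
via_add_at = target.copy()
np.add.at(via_add_at, (rows, cols), val)

# bincount route: flatten the indices, sum the weights, reshape back.
flat = np.ravel_multi_index((rows, cols), target.shape)
via_bincount = np.bincount(flat, weights=val, minlength=target.size)
via_bincount = via_bincount.reshape(target.shape)

print(np.array_equal(via_add_at, via_bincount))  # True
print(via_add_at[0, 1], via_add_at[2, 3])        # 3.0 8.0
```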

edited Feb 9, 2017 at 12:36

answered Feb 9, 2017 at 11:57

Daniel F


4 Comments

user6794223 Over a year ago

Should this work if empty and val are multidimensional? For example, empty.shape = (5,2,2) and val.shape = (10,2,2)?

Daniel F Over a year ago

Not as written, you'd need to ravel_multi_index your indices, ravel empty and val, and reshape the end results. At that point np.add.at is probably faster, or at least more pythonic. But that's not what you asked :)

user6794223 Over a year ago

It is not what I asked, you are right. I didn't expect it would matter. But thanks!

Daniel F Over a year ago

Added an implementation with bincount. I couldn't get np.add.at to take multiple indices. Do you have working code for it?
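
For the shapes raised in these comments, empty.shape = (5,2,2) and val.shape = (10,2,2), one reading is that each block val[k] should be added to empty[ind[k]] along the first axis. The thread does not spell this out, so treat it as an assumption. Under it, np.add.at works without any flattening. The random data and the index array below are made up for illustration.

```python
import numpy as np

rng = np.random.RandomState(0)  # legacy generator, present in old and new NumPy
empty = np.zeros((5, 2, 2))
val = rng.rand(10, 2, 2)
ind = np.array([0, 1, 1, 4, 2, 2, 2, 0, 3, 4])  # one first-axis index per val block

# Each val[k] (shape (2, 2)) is added to empty[ind[k]].
np.add.at(empty, ind, val)

# Reference result from the explicit loop in the question.
expected = np.zeros((5, 2, 2))
for i, v in zip(ind, val):
    expected[i] += v

print(np.allclose(empty, expected))  # True
```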

Jan Christoph Terasa · Answer · 2017-02-09 11:59:41Z

1

This is basically a histogram, so in the one-dimensional case:

h, b = np.histogram(ind, bins=np.arange(empty.size+1), weights=val)
empty += h

If empty contains only zeros, you can skip the second statement and use h directly.
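
A minimal sketch of the histogram approach on the question's data, checked against np.bincount. Float weights are used here, so h comes back as a float array; n is an illustrative name for the length of the target array.

```python
import numpy as np

ind = np.array([2, 3, 1, 2, 4, 2, 4, 2, 1, 1, 1, 2])
val = np.ones(ind.size)
n = 5  # length of the target array

# Bin edges 0, 1, ..., n give each integer index its own bin.
# Only the last bin is closed on the right, which is harmless
# because no index equals n.
h, edges = np.histogram(ind, bins=np.arange(n + 1), weights=val)

print(h.tolist())  # [0.0, 4.0, 5.0, 1.0, 2.0]
print(np.array_equal(h, np.bincount(ind, weights=val, minlength=n)))  # True
```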

edited Feb 9, 2017 at 11:59

answered Feb 9, 2017 at 11:47

Jan Christoph Terasa


1 Comment

Jan Christoph Terasa Over a year ago

I removed the part about np.bincount because @DanielForsman had already given that answer, which I only noticed after editing.
