I want to change values in one array using values from another array via a third array of indices, such as:

import numpy as np
F = np.zeros((4,3)) # array I wish to change
f = np.array([[3,4,0],[0,0,1]]) # values I wish to add to F
i = np.array([2, 2]) # indices in F I wish to affect

Let's use this data to do a += operation on F at each index in i, using the values in f:

for idx in range(len(i)):
    # add the idx-th row of f to the row of F selected by i[idx]
    F[i[idx]] += f[idx]

# F[2] is now equal to np.array([ 3.,  4.,  1.]) because 
# both values in f have been correctly added to F[2]

I assumed I could do the same operation in one line like so:

F[i] += f
# F[2] is now equal to np.array([ 0.,  0.,  1.])
# I expected np.array([ 3.,  4.,  1.])

But this fails. The result I expected was np.array([ 3., 4., 1.])

If i had been a list of distinct indices (e.g. array([0, 2])), then F[0] and F[2] would have been set to the proper rows of f. In this case, however, I want a += operation, and when indices repeat I want the result to be cumulative.

Is there a way to do this in a simple one-line operation?

  • Could you add your expected output? Commented Feb 19, 2016 at 23:44
  • @Cleb, third python code snippet: "# i expected np.array([ 3., 4., 1.])"... but I should have made that clearer in my question. thanks! Commented Feb 19, 2016 at 23:52
  • @Kevin in this case yes, what i want is to do the += on F[2] twice. Commented Feb 20, 2016 at 0:11

2 Answers

The operation you're looking for is numpy.add.at. Crucially, this does an unbuffered addition at the indices specified, so repeated indices accumulate, whereas F[i] += f uses an internal buffer and only the last write to a repeated index survives.
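As a minimal sketch using the arrays from the question, the difference between the buffered and unbuffered forms looks roughly like this (G is a hypothetical second array, introduced only for the comparison):

```python
import numpy as np

f = np.array([[3, 4, 0], [0, 0, 1]])  # rows to add
i = np.array([2, 2])                  # target row indices (note the repeat)

# Buffered: F[i] + f is computed first, then assigned back,
# so for the repeated index 2 only the last row written remains.
G = np.zeros((4, 3))
G[i] += f

# Unbuffered: each (index, row) pair is accumulated in turn.
F = np.zeros((4, 3))
np.add.at(F, i, f)

print(G[2])  # [0. 0. 1.]
print(F[2])  # [3. 4. 1.]
```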

However, ufunc.at is notorious for being slow. If your arrays are sufficiently large and rectangular, it might be worth looping over the columns and using bincount instead. Example timings:

In [43]: n = 10**5
    ...: m = 10**6
    ...: I = np.random.randint(n, size=m)
    ...: f = np.random.rand(m, 3)

In [44]: %%time
    ...: F = np.zeros((n, 3))
    ...: np.add.at(F, I, f)
Wall time: 624 ms

In [45]: %%time
    ...: F2 = np.zeros((n, 3))
    ...: for dim in range(3):
    ...:     F2[:,dim] += np.bincount(I, f[:,dim], n)
Wall time: 94 ms

In [46]: np.allclose(F, F2)
Out[46]: True
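For reference, the bincount loop could be wrapped in a small helper along these lines. This is only a sketch: the name scatter_add_bincount and its parameters are made up for illustration, and it assumes a 1D array of non-negative integer indices and a 2D array of values.

```python
import numpy as np

def scatter_add_bincount(indices, values, n_rows):
    """Sum the rows of `values` into an (n_rows, n_cols) float64 array,
    accumulating rows that share the same entry in `indices`."""
    n_cols = values.shape[1]
    out = np.zeros((n_rows, n_cols))
    for col in range(n_cols):
        # bincount sums the weights per index; minlength pads the result
        # to n_rows so unused rows come out as 0.
        out[:, col] = np.bincount(indices, weights=values[:, col],
                                  minlength=n_rows)
    return out

# Data from the question
f = np.array([[3, 4, 0], [0, 0, 1]])
i = np.array([2, 2])
F = scatter_add_bincount(i, f, 4)
print(F[2])  # [3. 4. 1.]
```

Note that the result is always float64, since bincount with weights returns float64.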

2 Comments

Are there cases in which add.at would be preferred over the bincount solution? +1 for the latter one; looks better than what I came up with.
@Cleb - Yes: bincount with weights always returns float64, while add.at supports all numpy types. Also, add.at takes an axis argument, whereas with bincount you need to loop manually and take 1D slices. If you need to call bincount too often, relative to the size of the data, at some point add.at will be faster.
1

In this particular case (i contains only one unique number) you can avoid the for-loop by:

F[i] += sum(f)

array([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 3.,  4.,  1.],
       [ 0.,  0.,  0.]])

If i contains several distinct numbers with no repeats, then the following would work fine:

F2 = np.zeros((4,3))
i2 = np.array([2, 3])
F2[i2] += f

Then F2 is:

array([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 3.,  4.,  0.],
       [ 0.,  0.,  1.]])

You could check the number of distinct values in i using set(i) and then apply either the first or the second option to F, depending on the length of set(i).
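A rough sketch of that check is below. The function name add_rows is hypothetical, and the two options above only cover the cases where i is all one value or all distinct values. For a mix of repeated and distinct indices this sketch falls back to np.add.at, which is an assumption borrowed from the other answer rather than part of this one.

```python
import numpy as np

def add_rows(F, i, f):
    """Add the rows of f to the rows of F selected by i, in place."""
    n_distinct = len(set(i.tolist()))
    if n_distinct == 1:
        # every row of f targets the same row of F
        F[i[0]] += f.sum(axis=0)
    elif n_distinct == len(i):
        # no repeats, so plain fancy indexing is safe
        F[i] += f
    else:
        # mixed case: not covered by the two options above
        np.add.at(F, i, f)
    return F

f = np.array([[3, 4, 0], [0, 0, 1]])
F_same = add_rows(np.zeros((4, 3)), np.array([2, 2]), f)
F_diff = add_rows(np.zeros((4, 3)), np.array([2, 3]), f)
print(F_same[2])  # [3. 4. 1.]
print(F_diff[2])  # [3. 4. 0.]
print(F_diff[3])  # [0. 0. 1.]
```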
