Set value in 2D Numpy array based on row sum

Question

Is it possible to accomplish the following with NumPy, and with good performance?

Initial 2D array:

array([[0, 1, 1, 1, 1, 0],
       [0, 0, 1, 0, 0, 0],
       [1, 0, 0, 0, 0, 1]])

For every row whose sum is less than 4, set the last item in that row to 1:

array([[0, 1, 1, 1, 1, 0],
       [0, 0, 1, 0, 0, 1],
       [1, 0, 0, 0, 0, 1]])

Then divide each item by the sum of its row to get this result:

array([[0, 0.25, 0.25, 0.25, 0.25, 0],
       [0, 0, 0.5, 0, 0, 0.5],
       [0.5, 0, 0, 0, 0, 0.5]])

tel · Accepted Answer · 2018-11-27 11:06:41Z

You can do the conditional assignment in a single line with some clever boolean indexing:

import numpy as np

arr = np.array([[0, 1, 1, 1, 1, 0],
                [0, 0, 1, 0, 0, 0],
                [1, 0, 0, 0, 0, 1]])

arr[arr.sum(axis=1) < 4, -1] = 1
print(arr)

Output:

[[0 1 1 1 1 0]
 [0 0 1 0 0 1]
 [1 0 0 0 0 1]]

You can then divide each row by its sum like this:

arr = arr / arr.sum(axis=1, keepdims=True)
print(arr)

Output:

[[0.   0.25 0.25 0.25 0.25 0.  ]
 [0.   0.   0.5  0.   0.   0.5 ]
 [0.5  0.   0.   0.   0.   0.5 ]]
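As a rough illustration of why keepdims=True is used in the division, the sketch below compares the shapes of the row sums with and without it. The variable names (flat_sums, col_sums, normalized) are made up for this example and are not part of the original answer.

```python
import numpy as np

# The array after the conditional assignment above.
arr = np.array([[0, 1, 1, 1, 1, 0],
                [0, 0, 1, 0, 0, 1],
                [1, 0, 0, 0, 0, 1]])

flat_sums = arr.sum(axis=1)                # shape (3,)
col_sums = arr.sum(axis=1, keepdims=True)  # shape (3, 1)
print(flat_sums.shape, col_sums.shape)

# (3, 6) / (3, 1): each row is divided by its own sum.
normalized = arr / col_sums
print(normalized.sum(axis=1))  # every row now sums to 1

# (3, 6) / (3,): the trailing dimensions (6 and 3) do not match,
# so NumPy refuses to broadcast.
try:
    arr / flat_sums
    broadcast_ok = True
except ValueError as exc:
    broadcast_ok = False
    print("broadcast failed:", exc)
```

Writing arr.sum(axis=1)[:, None] produces the same (3, 1) shape as keepdims=True, so either spelling should work here.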

Explanation

Let's give the boolean index array arr.sum(axis=1) < 4 the name boolix. boolix looks like:

[False  True  True]

If you index arr with boolix, it returns an array containing every row of arr for which the corresponding value in boolix is True. So the result of arr[boolix] is an array with the 2nd and 3rd rows of arr:

[[0 0 1 0 0 0]
 [1 0 0 0 0 1]]

In the code above, arr was indexed as arr[boolix, -1]. Adding a second index, as in arr[anything, -1], makes the result contain only the last value in each selected row (i.e. the value in the last column). So arr[boolix, -1] will return:

[0 1]

Since these slices can also be assigned to, assigning 1 to the slice arr[boolix, -1] solves your problem.
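The steps of this explanation can be replayed one at a time. This is only a sketch of the walkthrough above; selected_rows and last_values are illustrative names, not something the original code defines.

```python
import numpy as np

arr = np.array([[0, 1, 1, 1, 1, 0],
                [0, 0, 1, 0, 0, 0],
                [1, 0, 0, 0, 0, 1]])

# Row sums are 4, 1 and 2, so only the 2nd and 3rd rows are below 4.
boolix = arr.sum(axis=1) < 4
print(boolix)

# Rows where boolix is True (boolean indexing returns a copy).
selected_rows = arr[boolix]
print(selected_rows)

# Last-column value of each selected row, before the assignment.
last_values = arr[boolix, -1]
print(last_values)

# Assigning through the same index writes into arr itself.
arr[boolix, -1] = 1
print(arr)
```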

b-fg · Accepted Answer · 2018-11-27 11:03:17Z


~~numpy.where can also be useful here to find the rows matching your condition~~:

import numpy as np
a = np.array([[0, 1, 1, 1, 1, 0],
              [0, 0, 1, 0, 0, 0],
              [1, 0, 0, 0, 0, 1]])

a[np.sum(a,axis=1) < 4, -1] = 1
a = a/a.sum(axis=1)[:,None]

print(a)

# Output 
# [[0.   0.25 0.25 0.25 0.25 0.  ]
#  [0.   0.   0.5  0.   0.   0.5 ]
#  [0.5  0.   0.   0.   0.   0.5 ]]

PS: Edited after @tel suggestion :)

edited Nov 27, 2018 at 11:03

answered Nov 27, 2018 at 10:51


2 Comments

tel Over a year ago

Combining the row and column slices with [x, -1] is a nice idea. However, the np.where is completely pointless. You can remove it (and the extra work it's doing) and you get the same effect.

b-fg Over a year ago

Oh right! For some reason I omitted this. Thanks for pointing it out. I will edit my answer.

Sociopath · Accepted Answer · 2018-11-27 10:37:22Z


I think you need:

x = np.array([[0, 1, 1, 1, 1, 0],
   [0, 0, 1, 0, 0, 0],
   [1, 0, 0, 0, 0, 1]])

x[:,-1][x.sum(axis=1) < 4] = 1
# array([[0, 1, 1, 1, 1, 0],
#        [0, 0, 1, 0, 0, 1],
#        [1, 0, 0, 0, 0, 1]])

print(x/x.sum(axis=1)[:,None])

Output:

array([[0.  , 0.25, 0.25, 0.25, 0.25, 0.  ],
       [0.  , 0.  , 0.5 , 0.  , 0.  , 0.5 ],
       [0.5 , 0.  , 0.  , 0.  , 0.  , 0.5 ]])

answered Nov 27, 2018 at 10:37


2 Comments

Nils Werner Over a year ago

Indexing twice (e.g. x[a][b] instead of x[a, b]) is usually a bad idea, as it may have unintended consequences (e.g. sometimes you can assign values this way, sometimes you can't)

tel Over a year ago

@NilsWerner That is a good point that hadn't occurred to me when I was writing my answer (which did originally use x[a][b]). The issue is that while x[a][b] will return a view in many cases, sometimes it does return a copy instead, right?
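A small sketch of the view-versus-copy behaviour discussed in these comments. Basic slicing such as x[:, -1] returns a view, while indexing with a boolean or integer array returns a copy, so the order of the two indexing steps decides whether a chained assignment sticks. The names view_first, copy_first and single are invented for this example.

```python
import numpy as np

x = np.array([[0, 1, 1, 1, 1, 0],
              [0, 0, 1, 0, 0, 0],
              [1, 0, 0, 0, 0, 1]])
mask = x.sum(axis=1) < 4

# Slice first, mask second: x[:, -1] is a view, so the write reaches the array.
view_first = x.copy()
view_first[:, -1][mask] = 1

# Mask first, slice second: x[mask] is a temporary copy, so the write is lost.
copy_first = x.copy()
copy_first[mask][:, -1] = 1

# A single combined index avoids the question entirely.
single = x.copy()
single[mask, -1] = 1

print(view_first[:, -1])  # last column updated
print(copy_first[:, -1])  # last column unchanged
print(single[:, -1])      # last column updated
```

Because the chained form silently does nothing in the second case, the combined index x[mask, -1] is the safer habit.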
