Suppose I have a 100000 x 100 matrix:
import numpy as np
mat = np.random.randint(2, size=(100000,100))
I wish to go through this matrix row by row. If a row consists entirely of 1s or entirely of 0s, I wish to set a state variable to that value. Otherwise (the state is not changed), I wish to set the entire row to the current value of state. The initial value of state is 0.
Naively, this can be done in a for loop as follows:
state = 0
for row in mat:
    if set(row) == {1}:
        state = 1
    elif set(row) == {0}:
        state = 0
    else:
        row[:] = state
However, as the matrix grows, this takes an impractical amount of time. Could someone point me in the right direction on how to leverage numpy to vectorize this loop and speed it up?
So for a sample input
array([[0, 1, 0],
       [0, 0, 1],
       [1, 1, 1],
       [0, 0, 1],
       [0, 0, 1]])
The expected output in this case would be
array([[0, 0, 0],
       [0, 0, 0],
       [1, 1, 1],
       [1, 1, 1],
       [1, 1, 1]])
mat == 1 tests where there are 1s. (mat == 1).all(axis=1) tests whether each row is all 1s. That boolean array can be used to select rows.
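One possible way to build on those row masks is sketched below. It assumes the matrix holds only 0s and 1s, as in the question, and that the initial state is 0. The function name propagate_state is made up for illustration, and the forward-fill step using np.maximum.accumulate is one choice among several, not the only way to do this. The idea is that every output row ends up equal to the state in effect after that row is processed, so it is enough to compute one state value per row and broadcast it across the columns.

```python
import numpy as np

def propagate_state(mat):
    """Return a copy of mat where every row is filled with the running state.

    Hypothetical helper: assumes mat contains only 0s and 1s and that the
    state starts at 0, as described in the question.
    """
    n_rows, n_cols = mat.shape

    # Row masks: which rows are entirely 1s, which are entirely 0s.
    all_ones = (mat == 1).all(axis=1)
    all_zeros = (mat == 0).all(axis=1)
    uniform = all_ones | all_zeros

    # For each row, find the index of the most recent uniform row at or
    # before it (-1 if there has been none yet). This is a forward fill.
    idx = np.where(uniform, np.arange(n_rows), -1)
    idx = np.maximum.accumulate(idx)

    # State per row: 1 if the most recent uniform row was all 1s, else 0.
    # Rows with idx == -1 keep the initial state 0 (the np.where masks out
    # the meaningless all_ones[-1] lookup for those rows).
    state = np.where(idx >= 0, all_ones[idx], False).astype(mat.dtype)

    # Broadcast the per-row state across all columns.
    return np.repeat(state[:, None], n_cols, axis=1)

sample = np.array([[0, 1, 0],
                   [0, 0, 1],
                   [1, 1, 1],
                   [0, 0, 1],
                   [0, 0, 1]])
result = propagate_state(sample)
```

Unlike the loop in the question, this sketch returns a new array instead of modifying mat in place; writing mat[:] = propagate_state(mat) would give the in-place behaviour.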