I have an array a of ones and zeros (it might be rather large):

a = np.array([[1, 0, 0, 1, 0, 0],
              [1, 1, 0, 0, 1, 0],
              [0, 1, 1, 0, 0, 1],
              [0, 0, 0, 1, 1, 1]])

in which the "upper" rows are more "important": if there is a 1 in any column of the i-th row, then all ones in that column in the following rows must be zeroed.

So, the desired output should be:

array([[1, 0, 0, 1, 0, 0],
       [0, 1, 0, 0, 1, 0],
       [0, 0, 1, 0, 0, 1],
       [0, 0, 0, 0, 0, 0]])

In other words, there should be at most a single 1 per column.

I'm looking for a more numpy way to do this (i.e. minimising or, better, avoiding the loops).

  • To clarify, do you mean the first 1 in each column should be kept as 1 and everything else should be 0? Commented May 5, 2021 at 8:49
  • Yes, the first (from "above") 1 in the given column must be kept while all the following ones are zeroed. Commented May 5, 2021 at 8:58

3 Answers

Your array:

     [[1, 0, 0, 1, 0, 0],
      [1, 1, 0, 0, 1, 0],
      [0, 1, 1, 0, 0, 1],
      [0, 0, 0, 1, 1, 1]]

Transpose it with numpy:

a = np.transpose(your_array)

Now it looks like this:

  [[1, 1, 0, 0],
   [0, 1, 1, 0],
   [0, 0, 1, 0],
   [1, 0, 0, 1],
   [0, 1, 0, 1],
   [0, 0, 1, 1]]

Keep only the first non-zero element of each row and zero everything after it:

 res = np.zeros(a.shape, dtype="int64")
 idx =  np.arange(res.shape[0])
 args = a.astype(bool).argmax(1)
 res[idx, args] = a[idx, args]
 

The output of res is this:

 #### Output
  [[1, 0, 0, 0],
   [0, 1, 0, 0],
   [0, 0, 1, 0],
   [1, 0, 0, 0],
   [0, 1, 0, 0],
   [0, 0, 1, 0]]

Re-transpose your array:

a = np.transpose(res)

  [[1, 0, 0, 1, 0, 0],
   [0, 1, 0, 0, 1, 0],
   [0, 0, 1, 0, 0, 1],
   [0, 0, 0, 0, 0, 0]]

EDIT: Thanks to @The.B for the tip


3 Comments

You have floats in your final array. dtype="int64" should come in handy.
Instead of using np.transpose, we can also choose to use a.T and res.T.
That works, thanks! Actually, you don't need np.transpose at all: rows = a.argmax(axis=0), cols = np.arange(a.shape[1]). I.e. just flip axes.
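
The transpose-free variant suggested in the last comment could look roughly like the sketch below. The function name keep_first_one_per_column is hypothetical (it does not appear in the answer), and the sketch assumes a is a 2-D array containing only 0 and 1.

```python
import numpy as np

def keep_first_one_per_column(a):
    """Keep only the topmost 1 in each column (hypothetical helper name)."""
    a = np.asarray(a)
    rows = a.argmax(axis=0)          # row index of the first maximum in each column
    cols = np.arange(a.shape[1])     # one index per column
    res = np.zeros_like(a)           # same shape and dtype as the input
    # Copies a 1 where the column contains a 1, and a 0 for all-zero columns
    # (argmax returns 0 there, and a[0, col] is 0 in that case).
    res[rows, cols] = a[rows, cols]
    return res

a = np.array([[1, 0, 0, 1, 0, 0],
              [1, 1, 0, 0, 1, 0],
              [0, 1, 1, 0, 0, 1],
              [0, 0, 0, 1, 1, 1]])
result = keep_first_one_per_column(a)
```

Using np.zeros_like keeps the integer dtype of the input, so the float issue mentioned in the first comment should not arise.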

An alternative solution is to do a forward fill followed by the cumulative sum and then replace all values which are not 1 with 0:

a = np.array([[1, 0, 0, 1, 0, 0],
              [1, 1, 0, 0, 1, 0],
              [0, 1, 1, 0, 0, 1],
              [0, 0, 0, 1, 1, 1]])

ff = np.maximum.accumulate(a, axis=0)
cs = np.cumsum(ff, axis=0)
cs[cs > 1] = 0

Output in cs:

array([[1, 0, 0, 1, 0, 0],
       [0, 1, 0, 0, 1, 0],
       [0, 0, 1, 0, 0, 1],
       [0, 0, 0, 0, 0, 0]])
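
To see why this works, it may help to trace a single column through the two steps. The column below is an illustrative example, not data from the question, and np.where is used here instead of the in-place assignment only so that every intermediate stays visible.

```python
import numpy as np

# One column whose first 1 sits in row 1 (illustrative data).
col = np.array([[0], [1], [0], [1]])

ff = np.maximum.accumulate(col, axis=0)   # forward fill: 0, 1, 1, 1
cs = np.cumsum(ff, axis=0)                # running count: 0, 1, 2, 3
first_only = np.where(cs == 1, 1, 0)      # the count is exactly 1 only at the first 1
```

After the forward fill, every row from the first 1 downward holds a 1, so the cumulative sum equals 1 only at the row where the first 1 occurs.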

EDIT

This will do the same thing and should be slightly more efficient:

ff = np.maximum.accumulate(a, axis=0)
ff ^ np.pad(ff, ((1,0), (0,0)))[:-1]

Output:

array([[1, 0, 0, 1, 0, 0],
       [0, 1, 0, 0, 1, 0],
       [0, 0, 1, 0, 0, 1],
       [0, 0, 0, 0, 0, 0]])

And if you want to write the forward fill directly into a preallocated buffer, avoiding the temporary copy made by np.pad:

out = np.zeros((a.shape[0]+1, a.shape[1]), dtype=a.dtype)
np.maximum.accumulate(a, axis=0, out=out[1:])
out[:-1] ^ out[1:]

Output:

array([[1, 0, 0, 1, 0, 0],
       [0, 1, 0, 0, 1, 0],
       [0, 0, 1, 0, 0, 1],
       [0, 0, 0, 0, 0, 0]])
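
As a quick sanity check, the XOR variant can be compared with a plain double loop on random input. This is only a sketch: the function names first_ones_xor and first_ones_loop, the seed, and the array size are arbitrary choices for illustration, and the input is assumed to be an integer array of 0s and 1s.

```python
import numpy as np

def first_ones_xor(a):
    """Vectorised version: forward fill, then XOR with the fill shifted down one row."""
    ff = np.maximum.accumulate(a, axis=0)
    shifted = np.pad(ff, ((1, 0), (0, 0)))[:-1]   # zero row on top, last row dropped
    return ff ^ shifted

def first_ones_loop(a):
    """Reference version: explicit loops, keeps the first 1 found in each column."""
    out = np.zeros_like(a)
    for j in range(a.shape[1]):
        for i in range(a.shape[0]):
            if a[i, j] == 1:
                out[i, j] = 1
                break
    return out

rng = np.random.default_rng(0)            # arbitrary seed
a = rng.integers(0, 2, size=(50, 40))     # random 0/1 test array
agree = np.array_equal(first_ones_xor(a), first_ones_loop(a))
```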


You can traverse each column of the array and keep only the first 1 you encounter; every later 1 in that column is set to 0:

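for col in a.T:                  # each col is a view, so writes change a
  f=0                            # becomes 1 once the first 1 has been seen
  for i, x in enumerate(col):
    if(x==1 and f==0):
      f=1
    else:
      col[i]=0                   # assign through the view; x=0 would not modify a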
