1

I am trying to delete all rows that have one or fewer non-zero elements, in each of several 2D arrays contained in the list 'a'.

This method works when I run it outside the 'i' loop, but not when I run the whole thing. I know that I cannot delete rows over which I am iterating, but I believe that I am not doing so in this case, because I am only deleting rows in arrays contained in a, not the arrays themselves.

for i in range(len(a)):
  del_idx=[]
  for j in range(len(a[i])):
    nonzero=np.nonzero(a[i][j])
    nonzero_len=len(nonzero[0]) #because np.nonzero outputs a tuple
    if nonzero_len<=1:
        del_idx.append(j)
    else:
        continue
  np.delete(a[i],(del_idx),axis=0)

Anyone know what's going on here? If this really does not work, how can I delete these elements without using a loop? This is Python 2.7

Thank you!

2 Answers

1

You should aim to avoid for loops with NumPy when vectorised operations are available. Here, for example, you can use Boolean indexing:

import numpy as np

np.random.seed(0)

A = np.random.randint(0, 2, (10, 3))

res = A[(A != 0).sum(1) > 1]

array([[0, 1, 1],
       [0, 1, 1],
       [1, 1, 1],
       [1, 1, 0],
       [1, 1, 0],
       [0, 1, 1],
       [1, 1, 0]])

The same logic can be applied for each array within your list of arrays.
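As a minimal sketch of what that could look like for a list of arrays: the list below is a hypothetical stand-in for 'a' from the question, not the asker's data. It also shows one likely reason the original loop appeared to do nothing: np.delete returns a new array instead of modifying its input, so the result has to be stored back into the list.

```python
import numpy as np

# Hypothetical stand-in for the list 'a' from the question.
a = [
    np.array([[0, 0, 0], [1, 2, 0], [0, 3, 0]]),
    np.array([[4, 5, 6], [0, 0, 7], [8, 0, 9], [0, 0, 0]]),
]

# Vectorised version: keep rows with more than one non-zero element,
# one array at a time.
filtered = [arr[(arr != 0).sum(axis=1) > 1] for arr in a]

# Loop version: np.delete returns a copy, so assign the result back.
for i in range(len(a)):
    del_idx = [j for j in range(len(a[i])) if np.count_nonzero(a[i][j]) <= 1]
    a[i] = np.delete(a[i], del_idx, axis=0)
```

Both approaches should leave the same rows; the list comprehension simply avoids the inner Python loop.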

2 Comments

Works great, thanks. How would you do the same thing across the '0' axis? Changing .sum(1) to .sum(0) raises "boolean index did not match indexed array along dimension 0; dimension is 10 but corresponding boolean dimension is 3"
@AlexisBL, A[:, (A != 0).sum(0) > 1]
0

You can use np.where() for indexing:

import numpy as np

a = np.random.randint(0, 2, size=(10, 10))
# array([[1, 1, 0, 0, 0, 0, 0, 1, 1, 1],
#        [1, 0, 0, 0, 1, 1, 1, 1, 0, 1],
#        [1, 0, 1, 0, 0, 1, 0, 0, 0, 1],
#        [1, 0, 0, 1, 0, 1, 0, 1, 1, 0],
#        [1, 0, 0, 0, 1, 0, 1, 1, 0, 1],
#        [0, 0, 1, 1, 1, 0, 1, 0, 0, 0],
#        [1, 0, 0, 1, 1, 0, 0, 1, 1, 0],
#        [0, 0, 0, 1, 0, 1, 0, 1, 1, 1],
#        [0, 0, 1, 1, 0, 0, 1, 0, 1, 0],
#        [1, 1, 0, 0, 0, 1, 0, 0, 1, 1]])

# Indices of rows with fewer than 5 non-zero elements.
# In your case, the condition should be > 1.
np.where(np.count_nonzero(a, axis=1) < 5)
# (array([2, 5, 8]),)

# Index with those row indices to get the matching rows.
a[np.where(np.count_nonzero(a, axis=1) < 5)]
# array([[1, 0, 1, 0, 0, 1, 0, 0, 0, 1],
#        [0, 0, 1, 1, 1, 0, 1, 0, 0, 0],
#        [0, 0, 1, 1, 0, 0, 1, 0, 1, 0]])
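A short sketch using the threshold from the question (> 1) instead of the illustrative < 5. The seed and variable names here are arbitrary choices for the example. It also checks that indexing with the Boolean mask directly selects the same rows as going through np.where. Note that the axis argument of np.count_nonzero needs NumPy 1.12 or later.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only so the example is repeatable
a = np.random.randint(0, 2, size=(10, 10))

# True for rows that have more than one non-zero element.
mask = np.count_nonzero(a, axis=1) > 1

via_where = a[np.where(mask)]  # index with the row indices
via_mask = a[mask]             # index with the Boolean mask itself

# Both forms should select exactly the same rows.
print(np.array_equal(via_where, via_mask))
```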
