I have a sparse, symmetric array, and I want to delete a row and the matching column whenever every entry in that row (and column) fails a threshold condition. For example, given

import numpy as np

min_value = 2
a = np.array([[2, 2, 1, 0, 0], 
              [2, 0, 1, 4, 0], 
              [1, 1, 0, 0, 1], 
              [0, 4, 0, 1, 0], 
              [0, 0, 1, 0, 0]])

I would like to keep only the rows (and columns) that contain at least one value of 2 or more, so the example above would yield

a_new = np.array([[2, 2, 0],
                  [2, 0, 4],
                  [0, 4, 1]])

So I would lose rows 3 and 5 (and columns 3 and 5), since every entry in them is less than 2. I've had a look at "How could I remove the rows of an array if one of the elements of the row does not satisfy a condition?", "Delete columns based on repeat value in one row in numpy array" and "Delete a column in a multi-dimensional array if all elements in that column satisfy a condition", but the marked solutions do not fit what I'm attempting to accomplish.

I was thinking of performing something similar to:

a_new = []
min_count = 2

for row in a:
    for i in row:
        if i >= min_count:
            # runs once per qualifying entry, so a row can be added repeatedly
            a_new.append(row)
print(a_new)

but this doesn't work: it never deletes a bad column, and if a row has two or more values at or above the threshold, it appends that row multiple times.

1 Answer

You could use a vectorized solution, as shown below -

# Get valid mask
mask = a >= min_value

# As per requirements, look for ANY match along rows and cols and 
# use those masks to index into row and col dim of input array with
# 1D open meshes from np.ix_ and thus select a 2D slice out of it
out = a[np.ix_(mask.any(1),mask.any(0))]
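
To make the role of np.ix_ more concrete, here is a minimal sketch that rebuilds the sample array from the question and inspects the open mesh. The names row_keep, col_keep and mesh are illustrative only and are not part of the original answer.

```python
import numpy as np

min_value = 2
a = np.array([[2, 2, 1, 0, 0],
              [2, 0, 1, 4, 0],
              [1, 1, 0, 0, 1],
              [0, 4, 0, 1, 0],
              [0, 0, 1, 0, 0]])

mask = a >= min_value
row_keep = mask.any(1)  # True for rows with at least one entry >= min_value
col_keep = mask.any(0)  # the same test along columns

# np.ix_ turns the boolean masks into integer index arrays of shape
# (rows_kept, 1) and (1, cols_kept); these broadcast to a 2D selection.
mesh = np.ix_(row_keep, col_keep)
out = a[mesh]

print(mesh[0].shape, mesh[1].shape)
print(out)
```

For this input, rows and columns 0, 1 and 3 (0-based) survive, so out is the 3x3 array shown in the question.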

A simpler way to express it would be by selecting rows and then columns, like so -

a[mask.any(1)][:,mask.any(0)]
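
As a quick sanity check, assuming the same sample a and min_value as in the question, the two forms can be compared directly. The sketch also shows why passing both masks in a single index, without np.ix_, is not equivalent: NumPy then pairs the indices elementwise and returns a 1D result. The names rows, cols, chained, meshed and paired are illustrative.

```python
import numpy as np

min_value = 2
a = np.array([[2, 2, 1, 0, 0],
              [2, 0, 1, 4, 0],
              [1, 1, 0, 0, 1],
              [0, 4, 0, 1, 0],
              [0, 0, 1, 0, 0]])

mask = a >= min_value
rows, cols = mask.any(1), mask.any(0)

chained = a[rows][:, cols]        # two steps; builds an intermediate copy
meshed = a[np.ix_(rows, cols)]    # one step via an open mesh
paired = a[rows, cols]            # pairs indices: a[0,0], a[1,1], a[3,3]

print(np.array_equal(chained, meshed))
print(paired)
```

The chained form is easier to read, while the np.ix_ form avoids the intermediate row-filtered array.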

Abusing the symmetric nature of the input array, it would simplify to -

mask0 = (a>=min_value).any(0)
out = a[np.ix_(mask0,mask0)]

Sample run -

In [488]: a
Out[488]: 
array([[2, 2, 1, 0, 0],
       [2, 0, 1, 4, 0],
       [1, 1, 0, 0, 1],
       [0, 4, 0, 1, 0],
       [0, 0, 1, 0, 0]])

In [489]: min_value
Out[489]: 2

In [490]: mask0 = (a>=min_value).any(0)

In [491]: a[np.ix_(mask0,mask0)]
Out[491]: 
array([[2, 2, 0],
       [2, 0, 4],
       [0, 4, 1]])

Alternatively, we can use the row and column indices of the valid mask, like so -

r,c = np.where(a>=min_value)
out = a[np.unique(r)[:,None],np.unique(c)]

Again abusing the symmetric nature, the simplified version would be -

r = np.unique(np.where(a>=min_value)[0])
out = a[np.ix_(r,r)]

r could also be obtained with a mix of boolean operations -

r = np.flatnonzero((a>=min_value).any(0))
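
A small sketch, again assuming the sample array from the question, confirms that both ways of computing r agree and select the same block. They only coincide because a is symmetric: one tests rows via np.where(...)[0], the other tests columns via any(0). The names r_unique and r_flat are illustrative.

```python
import numpy as np

min_value = 2
a = np.array([[2, 2, 1, 0, 0],
              [2, 0, 1, 4, 0],
              [1, 1, 0, 0, 1],
              [0, 4, 0, 1, 0],
              [0, 0, 1, 0, 0]])

# Row indices of all qualifying entries, deduplicated and sorted
r_unique = np.unique(np.where(a >= min_value)[0])

# Indices of columns that contain at least one qualifying entry
r_flat = np.flatnonzero((a >= min_value).any(0))

out = a[np.ix_(r_unique, r_unique)]

print(r_unique, r_flat)
print(out)
```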
4 Comments

Thank you very much, this is far more eloquent than I expected. This isn't that important, but is there a way to generate a list of the deleted rows and columns?
@Lukasz np.flatnonzero(~((a>=min_value).any(0)))?
That generates an empty list. It's not terribly important for me to generate this list, although it would have been nice, especially for larger matrices.
@Lukasz Are you sure about it? Can you run it again and check whether it's empty? I might have edited that comment, and you might have picked up the wrong version.
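
For reference, a minimal sketch of the expression from the comment above, run on the sample array from the question. On that input it is not empty: it returns the 0-based indices 2 and 4, which correspond to rows/columns 3 and 5 in the question's 1-based wording. The names keep and dropped are illustrative.

```python
import numpy as np

min_value = 2
a = np.array([[2, 2, 1, 0, 0],
              [2, 0, 1, 4, 0],
              [1, 1, 0, 0, 1],
              [0, 4, 0, 1, 0],
              [0, 0, 1, 0, 0]])

keep = (a >= min_value).any(0)   # True where a column has an entry >= min_value
dropped = np.flatnonzero(~keep)  # 0-based indices of removed rows/columns

print(dropped)       # 0-based
print(dropped + 1)   # 1-based, matching the question's wording
```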
