
I have a numpy 2d array (8000x7200). I want to count the number of cells having a value greater than a specified threshold. I tried to do this using a double loop, but it takes a lot of time. Is there a way to perform this calculation quickly?

2 Answers


Assume your variables are defined as

import numpy as np

np.random.seed([3,1415])  # fixed seed so the example is reproducible
a = np.random.rand(8000, 7200)
threshold = .5

Then use sum
(a > threshold) is a boolean array marking every cell that is greater than the threshold. Since booleans behave as integers when summed, with False as zero and True as one, we can simply add them up to get the count. numpy's sum sums over the entire array by default.

(a > threshold).sum()
28798689
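
To make the boolean-mask step concrete, here is a minimal sketch on a tiny array that can be checked by hand. The values in it are made up for illustration and are not from the question.

```python
import numpy as np

# Hypothetical 2x3 array, small enough to verify by eye.
a = np.array([[0.1, 0.7, 0.9],
              [0.5, 0.2, 0.8]])
threshold = 0.5

# Element-wise comparison: a boolean array with the same shape as a.
mask = a > threshold

# Summing the mask counts the True entries (True adds 1, False adds 0).
count = int(mask.sum())

print(mask)
print(count)
```

Note that the comparison is strict, so the cell equal to 0.5 is not counted; use >= if cells equal to the threshold should be included.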

4 Comments

this is almost spot on 50%
@Ev.Kounis that's by nature of the example I chose.
I know but I am still amused by it xD
Thanks a lot, it worked fine but it takes more time than np.count_nonzero

Your best bet is probably something like np.count_nonzero(x > threshold), where x is your 2-d array.

As the name implies, count_nonzero counts the number of elements that aren't zero. Since True counts as nonzero and False as zero, applying it to the boolean array x > threshold gives the number of elements that are True.
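
As a rough sketch of how this compares with the other approaches, the snippet below runs count_nonzero, the sum of the mask, and a plain double loop on the same data and checks that they agree. The array is deliberately much smaller than the 8000x7200 one in the question so the loop finishes quickly; the seed and shape are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility only
x = rng.random((200, 180))       # small stand-in for the 8000x7200 array
threshold = 0.5

# Vectorized: count the True entries of the boolean mask.
n_count_nonzero = int(np.count_nonzero(x > threshold))

# Vectorized alternative: sum the boolean mask.
n_sum = int((x > threshold).sum())

# Reference implementation: the slow double loop from the question.
n_loop = 0
for row in x:
    for value in row:
        if value > threshold:
            n_loop += 1

print(n_count_nonzero, n_sum, n_loop)
```

All three give the same count. The two vectorized versions avoid the per-element Python loop, which is where the time goes on a large array.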
