I have a numpy 2d array (8000x7200). I want to count the number of cells having a value greater than a specified threshold. I tried to do this using a double loop, but it takes a lot of time. Is there a way to perform this calculation quickly?
2 Answers
Assume your variables are defined as
import numpy as np
np.random.seed([3,1415])
a = np.random.rand(8000, 7200)
threshold = .5
Then use sum
(a > threshold) is a boolean array that marks every cell greater than the threshold. NumPy treats False as zero and True as one when doing arithmetic, so summing the array counts the True cells. NumPy's sum sums over the entire array by default.
(a > threshold).sum()
28798689
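To check the idea by hand, here is a minimal sketch on a tiny array. The array `small` and its values are made up for illustration and are not part of the original question.

```python
import numpy as np

# Hypothetical 2x3 array, small enough to count by hand
small = np.array([[0.1, 0.7, 0.5],
                  [0.9, 0.2, 0.6]])
threshold = 0.5

# Boolean array: True where the cell is strictly greater than the threshold
mask = small > threshold

# True counts as 1 and False as 0, so the sum is the number of matching cells
count = int(mask.sum())
print(count)  # 3 (0.7, 0.9 and 0.6; 0.5 itself is not greater than 0.5)

# Passing axis=0 gives one count per column instead of a single total
per_column = mask.sum(axis=0)
print(per_column.tolist())  # [1, 1, 1]
```

Note that the comparison is strict, so cells exactly equal to the threshold are not counted; use >= if they should be.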
np.count_nonzero
Your best bet is probably something like np.count_nonzero(x > threshold), where x is your 2-d array.
As the name implies, count_nonzero counts the number of elements that aren't zero. Since True is nonzero and False is zero, you can use it to count the number of elements that are True.
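A short sketch of this approach, checked against the sum-based count. The seed, the array shape (smaller than the 8000x7200 in the question so it runs quickly), and the variable names are assumptions for illustration.

```python
import numpy as np

# Assumed setup: a smaller random array stands in for the real data
rng = np.random.default_rng(0)
x = rng.random((800, 720))
threshold = 0.5

# Count the True entries of the boolean mask directly
n_count_nonzero = np.count_nonzero(x > threshold)

# The sum-based count should give the same number
n_sum = (x > threshold).sum()

print(n_count_nonzero == n_sum)  # True
```

Both forms avoid the Python-level double loop. On boolean masks count_nonzero is often the faster of the two, but it is worth timing both on your own array.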