I get a PEP8 complaint about numpy.where(mask == False), where mask is a boolean array. PEP8 recommends that the comparison be written as either 'if condition is False' or 'if not condition'. What is the Pythonic syntax for the suggested comparison inside numpy.where()?
1 Answer
To negate a boolean mask array in NumPy, use ~mask.
Also, consider whether you actually need where at all. Seemingly the most common use is some_array[np.where(some_mask)], but that's just an unnecessarily wordy and inefficient way to write some_array[some_mask].
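To make both points concrete, here is a minimal sketch. The array names and values are made up for illustration and are not from the question.

```python
import numpy as np

# Hypothetical example data
mask = np.array([True, False, True, False])
values = np.array([10, 20, 30, 40])

# Element-wise negation of a boolean array
inverted = ~mask

# Indices where the mask is False; np.where returns a tuple of index arrays
false_indices = np.where(~mask)[0]

# Selecting elements: the boolean mask can be used as the index directly
via_where = values[np.where(~mask)]
via_mask = values[~mask]

print(inverted, false_indices, via_where, via_mask)
```

Note that ~ is only a logical negation for arrays of bool dtype; on integer arrays it is a bitwise NOT, so np.logical_not(mask) is the safer spelling if the dtype is not guaranteed.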
5 Comments
hpaulj
Boolean indexing takes the same amount of time as the where version. I think that means there's an implicit where. docs.scipy.org/doc/numpy/reference/…
user2357112
@hpaulj: IIRC, for more complex cases, NumPy does call nonzero, but for simple cases, it bypasses that and uses the boolean mask directly.
user2357112
@hpaulj: See the array_boolean_subscript code in numpy/core/src/multiarray/mapping.c. The timings I'm getting aren't what I expected, though. On some inputs, where is actually faster!
user2357112
Digging into the source, I think it's because PyArray_Nonzero and the 1D integer array index case are optimized to not use a NpyIter for the 1D case, while that optimization never made it into array_boolean_subscript. For 1D arrays and non-small proportions of True in the mask, the NpyIter overhead overwhelms the advantage of not needing to create an integer array...
user2357112
while for 2D and up, or for small proportions of True in the mask, where loses due to the overhead of creating integer index arrays. Now I want to write up a pull request. arr[where(mask)] should not be outperforming arr[mask].
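For anyone who wants to check the timings discussed in these comments, a rough sketch along the following lines should do. The array size, the 0.5 threshold and the repeat count are arbitrary assumptions, and the numbers will depend on the NumPy version, array shape and proportion of True values, so no expected result is given.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
arr = rng.random(100_000)   # 1D array; size chosen arbitrarily
mask = arr > 0.5            # roughly half of the entries are True

# Both expressions select the same elements
same = np.array_equal(arr[mask], arr[np.where(mask)])

# Time each form over the same number of runs
t_mask = timeit.timeit(lambda: arr[mask], number=200)
t_where = timeit.timeit(lambda: arr[np.where(mask)], number=200)

print(f"arr[mask]: {t_mask:.4f}s  arr[np.where(mask)]: {t_where:.4f}s")
```

Trying a 2D array or a much sparser mask (for example arr > 0.99) is a simple way to probe the other cases mentioned above.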
numpy. Your expression looks perfectly fine to me. mask==False is the same as ~mask, but quite different from mask is False or not mask. == is valid, and cannot be replaced with a PEP8-compliant form. So ignore the complaint.
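A small sketch of the difference between the four spellings, using a made-up three-element mask:

```python
import numpy as np

mask = np.array([True, False, True])  # hypothetical example mask

# Element-wise comparison: the form the style checker complains about
elementwise = mask == False

# Element-wise negation: same result for a boolean array
negated = ~mask

# Identity test: a plain Python bool, since an array is never the object False
identity = mask is False

# Truth-value test: ambiguous for an array with more than one element
try:
    not mask
except ValueError as exc:
    error = str(exc)

print(elementwise, negated, identity, error)
```

So the two forms the style checker suggests are not drop-in replacements for an array: one silently returns a single False, the other raises.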