I just want to check if a numpy array contains a single number quickly similar to contains for a list. Is there a concise way to do this?
a = np.array(9,2,7,0)
a.contains(0) == true
I timed some methods to do this in python 3.7:
import numpy as np
rnd = np.random.RandomState(42)
one_d = rnd.randint(100, size=10000)
n_d = rnd.randint(100, size=(10000, 10000))
searched = 42
# One dimension
%timeit if np.isin(one_d, searched, assume_unique=True).any(): pass
%timeit if np.in1d(one_d, searched, assume_unique=True).any(): pass
%timeit if searched in one_d: pass
%timeit if one_d[np.searchsorted(one_d, searched)] == searched: pass
%timeit if np.count_nonzero(one_d == searched): pass
print("------------------------------------------------------------------")
# N dimensions
%timeit if np.isin(n_d, searched, assume_unique=True).any(): pass
%timeit if np.in1d(n_d, searched, assume_unique=True).any(): pass
%timeit if searched in n_d: pass
%timeit if np.count_nonzero(n_d == searched): pass
>>> 42.8 µs ± 79.3 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> 38.6 µs ± 76.2 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> 16.4 µs ± 57.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
>>> 4.7 µs ± 62.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
>>> 12.1 µs ± 69.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
>>> ------------------------------------------------------------------
>>> 239 ms ± 1.04 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
>>> 241 ms ± 1.17 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
>>> 156 ms ± 2.78 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
>>> 163 ms ± 527 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
The fastest for 1d arrays is the one proposed above of np.searchsorted however it cannot be used for ndarrays. Also np.count_nonzero is the fastest but it is not much faster than the pythonic in, so I would recommend to use in.
one_d to set_one_d = set(one_d.tolist()) and then checking searched in set_one_d is so much faster, if that initial overhead is acceptable in your use case. And for small 1D arrays, the overhead of converting to a set is so low that it is even faster than np.searchsorted().
np.array(9,2,7,0)raises an error.