As far as I can tell, there is nothing that could speed this up with pure NumPy. However, if you have numba you could write your own version of this "selection" using a jitted function:
import numba as nb

@nb.njit
def selection(a, b, c):
    # Walk over `a` and copy every "surviving" element (item > 0)
    # to the front of all three arrays, in-place.
    insert_idx = 0
    for idx, item in enumerate(a):
        if item > 0:
            a[insert_idx] = a[idx]
            b[insert_idx] = b[idx]
            c[insert_idx] = c[idx]
            insert_idx += 1
In my test runs this was roughly a factor of 2 faster than your NumPy code. However, numba may be a heavy dependency if you're not using conda.
Example:
>>> import numpy as np
>>> a = np.array([0., 1., 2., 0.])
>>> b = np.array([1., 2., 3., 4.])
>>> c = np.array([1., 2., 3., 4.])
>>> selection(a, b, c)
>>> a, b, c
(array([ 1., 2., 2., 0.]),
array([ 2., 3., 3., 4.]),
array([ 2., 3., 3., 4.]))
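Note that selection only overwrites the leading part of the arrays; it doesn't tell you how many elements were kept. If you also need that count (for example to slice a[:pos] as in your original code), here is a minimal sketch of a variant that returns the insert index (selection_count is just an illustrative name):

import numba as nb

@nb.njit
def selection_count(a, b, c):
    # Same compaction as above, but also return how many elements
    # were kept so the caller can slice a[:n], b[:n], c[:n].
    insert_idx = 0
    for idx, item in enumerate(a):
        if item > 0:
            a[insert_idx] = a[idx]
            b[insert_idx] = b[idx]
            c[insert_idx] = c[idx]
            insert_idx += 1
    return insert_idx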
Timing:
It's hard to time this accurately because all approaches work in-place. I therefore used timeit.repeat with number=1, so each repetition runs the statement on freshly set-up arrays (this avoids timings broken by the in-place nature of the solutions), and I took the min of the resulting list of timings, because that's advertised as the most useful quantitative measure in the documentation:
Note
It’s tempting to calculate mean and standard deviation from the result vector and report these. However, this is not very useful. In a typical case, the lowest value gives a lower bound for how fast your machine can run the given code snippet; higher values in the result vector are typically not caused by variability in Python’s speed, but by other processes interfering with your timing accuracy. So the min() of the result is probably the only number you should be interested in. After that, you should look at the entire vector and apply common sense rather than statistics.
Numba solution
import timeit
min(timeit.repeat("""selection(a, b, c)""",
"""import numpy as np
from __main__ import selection
a = np.arange(1000000) % 3
b = a.copy()
c = a.copy()
""", repeat=100, number=1))
0.007700118746939211
Original solution
import timeit
min(timeit.repeat("""survivors = np.where(a > 0)[0]
pos = len(survivors)
a[:pos] = a[survivors]
b[:pos] = b[survivors]
c[:pos] = c[survivors]""",
"""import numpy as np
a = np.arange(1000000) % 3
b = a.copy()
c = a.copy()
""", repeat=100, number=1))
0.028622144571883723
Alexander McFarlane's solution (now deleted)
import timeit
min(timeit.repeat("""survivors = comb_array[:, 0].nonzero()[0]
comb_array[:len(survivors)] = comb_array[survivors]""",
"""import numpy as np
a = np.arange(1000000) % 3
b = a.copy()
c = a.copy()
comb_array = np.vstack([a,b,c]).T""", repeat=100, number=1))
0.058305527038669425
So the Numba solution can actually speed this up by a factor of 3-4, while Alexander McFarlane's solution is actually slower (2x) than the original approach. However, the small number of repeats may bias the timings somewhat.
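As a quick sanity check (a sketch, run on small copies of the inputs so the in-place approaches don't interfere with each other), you can verify that the jitted version keeps the same prefix as your NumPy code:

import numpy as np

a = np.arange(100) % 3
b = a.copy()
c = a.copy()

# Run the NumPy approach on one set of copies ...
a1, b1, c1 = a.copy(), b.copy(), c.copy()
survivors = np.where(a1 > 0)[0]
pos = len(survivors)
a1[:pos] = a1[survivors]
b1[:pos] = b1[survivors]
c1[:pos] = c1[survivors]

# ... and the jitted version on another set.
a2, b2, c2 = a.copy(), b.copy(), c.copy()
selection(a2, b2, c2)

# Only the first `pos` entries are meaningful afterwards.
assert np.array_equal(a1[:pos], a2[:pos])
assert np.array_equal(b1[:pos], b2[:pos])
assert np.array_equal(c1[:pos], c2[:pos])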