We can use a vectorized approach based on two NumPy functions:
np.core.defchararray.add (also exposed under the public alias np.char.add), to prepend 't:' to the string versions of the values, and
np.where, to choose per element between that prefixed string and the default empty string, based on the condition.
So, we would have an implementation like so -
np.where(series1>series2,np.core.defchararray.add('t:',series2.astype(str)),'')
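To see what that one-liner produces, here is a minimal sketch on a tiny input. The five-element arrays are made up for illustration, and np.char.add is used as the public alias of np.core.defchararray.add.

```python
import numpy as np

# Small illustrative inputs (not from the original post's benchmark)
series1 = np.arange(5, dtype=float)   # [0., 1., 2., 3., 4.]
series2 = series1[::-1]               # [4., 3., 2., 1., 0.]

# Prefix every element of series2 with 't:', then keep the prefixed
# string only where series1 > series2; elsewhere use the empty string.
out = np.where(series1 > series2,
               np.char.add('t:', series2.astype(str)),
               '')

print(out.tolist())   # ['', '', '', 't:1.0', 't:0.0']
```

Note that the string conversion and concatenation run on every element here, even on those that end up being replaced by the empty string.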
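Boost it up!
We can boost the performance further by running np.core.defchararray.add only on the valid elements, i.e. those selected by the mask series1>series2. To do that, we initialize an output array filled with the default empty strings and then assign the prefixed strings only at the masked positions. The dtype 'U34' leaves room for the 32-character strings that astype(str) gives for floats plus the two characters of 't:'.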
So, the modified version would look something like this -
mask = series1>series2
out = np.full(series1.size,'',dtype='U34')
out[mask] = np.core.defchararray.add('t:',series2[mask].astype(str))
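As a possible variation, the output width does not have to be hard-coded: the sketch below builds the prefixed strings first and reuses their dtype for the output array. The small inputs are made up for illustration, and np.char.add stands in for np.core.defchararray.add (it is the public alias).

```python
import numpy as np

# Small illustrative inputs (not from the original post's benchmark)
series1 = np.arange(5, dtype=float)   # [0., 1., 2., 3., 4.]
series2 = series1[::-1]               # [4., 3., 2., 1., 0.]

mask = series1 > series2

# Convert and prefix only the elements that pass the condition
valid = np.char.add('t:', series2[mask].astype(str))

# Reuse the dtype of the prefixed strings so nothing gets truncated
out = np.full(series1.shape, '', dtype=valid.dtype)
out[mask] = valid

print(out.tolist())   # ['', '', '', 't:1.0', 't:0.0']
```

This keeps the output dtype in sync with the input dtype if series2 is ever something other than float64.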
Runtime test
Vectorized versions as functions:
def vectorized_app1(series1,series2):
    mask = series1>series2
    return np.where(mask,np.core.defchararray.add('t:',series2.astype(str)),'')

def vectorized_app2(series1,series2):
    mask = series1>series2
    out = np.full(series1.size,'',dtype='U34')
    out[mask] = np.core.defchararray.add('t:',series2[mask].astype(str))
    return out
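Before timing, it may be worth checking that both vectorized versions agree with the loop-based list comprehension. The sketch below does that on the same kind of input used for the timings; where_version and masked_version are stand-in names for the two functions above, rewritten with the public alias np.char.add so the block runs on its own.

```python
import numpy as np

def where_version(series1, series2):
    # Prefix everything, then select by the condition
    mask = series1 > series2
    return np.where(mask, np.char.add('t:', series2.astype(str)), '')

def masked_version(series1, series2):
    # Prefix only the elements that pass the condition
    mask = series1 > series2
    out = np.full(series1.size, '', dtype='U34')
    out[mask] = np.char.add('t:', series2[mask].astype(str))
    return out

series1 = np.asarray(range(10000)).astype(float)
series2 = series1[::-1]

# Loop-based reference, as in the timing comparison
expected = [['', 't:' + str(s2)][s1 > s2] for s1, s2 in zip(series1, series2)]

same_where = where_version(series1, series2).tolist() == expected
same_masked = masked_version(series1, series2).tolist() == expected
print(same_where, same_masked)
```

Both flags should come out True for this input, since str() on a NumPy float and astype(str) format these values the same way.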
Timings on a bigger dataset -
In [283]: # Setup input arrays
...: series1 = np.asarray(range(10000)).astype(float)
...: series2 = series1[::-1]
...:
In [284]: %timeit [['', 't:'+str(s2)][s1 > s2] for s1,s2 in zip(series1, series2)]
10 loops, best of 3: 32.1 ms per loop # OP/@hpaulj's soln
In [285]: %timeit vectorized_app1(series1,series2)
10 loops, best of 3: 20.5 ms per loop
In [286]: %timeit vectorized_app2(series1,series2)
100 loops, best of 3: 10.4 ms per loop
As noted by OP in the comments, we can probably play around with the dtype for series2 before the concatenation. So, I used U32 there to keep the output dtype the same as with the str dtype, i.e. series2.astype('U32') inside the np.core.defchararray.add call. The new timings for the vectorized approaches were -
In [290]: %timeit vectorized_app1(series1,series2)
10 loops, best of 3: 20.1 ms per loop
In [291]: %timeit vectorized_app2(series1,series2)
100 loops, best of 3: 10.1 ms per loop
So, there's some further marginal improvement there!