Vectorization/optimising for loop with numpy in Python

Question

I'm writing a script to handle some data from a sensor, represented here by the signal_gen function. As you can see, the testing function is quite loop-centered. Since this function is called many times, it is a bit slow, and I would appreciate a push in the right direction for optimising it.

I have read that it is possible to replace the for loop with vectorized array operations, but I can't get my head around how the I_avg[i] line should be written, since a single element y[i] is multiplied with the whole array x inside np.cos, and all of this is just one iteration of I_avg.

import numpy as np

def testing(signal):
    # y changes over time; set to a constant here for easier reading
    y = np.arange(0.0108, 0.0135, 0.001)
    x = np.arange(len(signal))
    I_avg = np.zeros(len(y))
    Q_avg = np.zeros_like(I_avg)
    for i in range(len(y)):
        # correlate the signal with a cosine/sine at frequency y[i]
        I_avg[i] = (signal * np.cos(2 * np.pi * y[i] * x)).sum()
        Q_avg[i] = (signal * np.sin(2 * np.pi * y[i] * x)).sum()
    D = np.power(I_avg, 2) + np.power(Q_avg, 2)
    max_index = np.argmax(D)
    phaseOut = np.arctan2(Q_avg[max_index], I_avg[max_index])
    return phaseOut

#just a test signal
def signal_gen():
    signal = np.random.random(size=251)
    return signal

I did, but I guess I was sloppy when I changed a couple of variable names.
– Runsiv, Commented Mar 13, 2017 at 9:16

Divakar · Accepted Answer · 2017-03-13 09:34:45Z

1

One vectorized approach replaces the loop with a matrix multiplication via numpy.dot, using NumPy broadcasting to build the full grid of phase terms, and gives us I_avg and Q_avg directly:

mult = 2*np.pi*y[:,None]*x
I_avg, Q_avg = np.cos(mult).dot(signal), np.sin(mult).dot(signal)

Please note that for the given sample, the loopy version we are competing against only runs 3 iterations (y has length 3). As such, we won't see a huge speedup here.
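As a sanity check, the vectorized lines can be compared against the original loop on the same input. This is only a minimal sketch: the function names loop_version and vectorized_version and the fixed random seed are assumptions made for illustration and are not part of the question's code.

```python
import numpy as np

def loop_version(signal, y):
    # Original approach: one cos/sin evaluation per frequency in y.
    x = np.arange(len(signal))
    I_avg = np.zeros(len(y))
    Q_avg = np.zeros_like(I_avg)
    for i in range(len(y)):
        I_avg[i] = (signal * np.cos(2 * np.pi * y[i] * x)).sum()
        Q_avg[i] = (signal * np.sin(2 * np.pi * y[i] * x)).sum()
    return I_avg, Q_avg

def vectorized_version(signal, y):
    # y[:, None] has shape (len(y), 1); multiplying by x broadcasts
    # to a (len(y), len(signal)) grid of phase terms.
    x = np.arange(len(signal))
    mult = 2 * np.pi * y[:, None] * x
    # Each row dotted with signal is the sum computed inside the loop.
    return np.cos(mult).dot(signal), np.sin(mult).dot(signal)

rng = np.random.default_rng(0)  # hypothetical seed, for repeatability only
signal = rng.random(251)
y = np.arange(0.0108, 0.0135, 0.001)

I_loop, Q_loop = loop_version(signal, y)
I_vec, Q_vec = vectorized_version(signal, y)
same = bool(np.allclose(I_loop, I_vec) and np.allclose(Q_loop, Q_vec))
print(same)
```

The rest of the original function (np.argmax on the summed squares, then np.arctan2) works unchanged on the vectorized I_avg and Q_avg.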

Runtime test -

In [9]: #just a test signal
   ...: signal = np.random.random(size=251)
   ...: y = np.arange(0.0108, 0.0135, 0.001)
   ...: x = np.arange(0, (len(signal)))
   ...: 

# Original approach
In [10]: %%timeit I_avg = np.zeros(len(y))
    ...: Q_avg = np.zeros_like(I_avg)
    ...: for i in range(0, len(y)):
    ...:     I_avg[i] = np.array(signal * (np.cos(2 * np.pi * y[i] * x))).sum()
    ...:     Q_avg[i] = np.array(signal * (np.sin(2 * np.pi * y[i] * x))).sum()
    ...: 
10000 loops, best of 3: 68 µs per loop

# Proposed approach
In [11]: %%timeit mult = 2*np.pi*y[:,None]*x
    ...: I_avg, Q_avg = np.cos(mult).dot(signal), np.sin(mult).dot(signal)
    ...: 
10000 loops, best of 3: 34.8 µs per loop

edited Mar 13, 2017 at 9:34

answered Mar 13, 2017 at 9:14

Divakar


kmario23 Over a year ago

We could also use np.newaxis instead of hard-coding None there. Just a suggestion; both are essentially the same.
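To illustrate the comment, a short sketch showing that both spellings index the same way; the array sizes simply mirror the question's sample data.

```python
import numpy as np

y = np.arange(0.0108, 0.0135, 0.001)
x = np.arange(251)

# np.newaxis is an alias for None, so both insert a length-1 axis.
a = y[:, None] * x
b = y[:, np.newaxis] * x

print(np.newaxis is None)
print(a.shape, np.array_equal(a, b))
```

Which one to use is a readability choice; np.newaxis states the intent more explicitly.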

Ohjeah · Accepted Answer · 2017-03-13 09:28:15Z

0

You can use np.einsum to form the outer product of y and x (the @ operator requires Python 3.5+):

yx = 2*np.pi*np.einsum("i,j->ij", y, x)
I_avg = np.cos(yx) @ signal
Q_avg = np.sin(yx) @ signal
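For comparison, the einsum subscripts "i,j->ij" describe an outer product, so the same grid can also be built with np.outer or with broadcasting. A small sketch, with variable names chosen here for illustration:

```python
import numpy as np

y = np.arange(0.0108, 0.0135, 0.001)
x = np.arange(251)

via_einsum = np.einsum("i,j->ij", y, x)   # outer product via einsum
via_outer = np.outer(y, x)                # dedicated outer-product function
via_broadcast = y[:, None] * x            # broadcasting, as in the other answer

all_match = bool(
    np.allclose(via_einsum, via_outer)
    and np.allclose(via_einsum, via_broadcast)
)
print(via_einsum.shape, all_match)
```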

answered Mar 13, 2017 at 9:28

Ohjeah
