How to vectorize a for loop containing if-statement in numpy?

Question

I am fairly new to numpy. I have the following code, which gives me the desired result:

import numpy as np

def get_result(S,K,delS):
    res=np.zeros(S.shape[0],dtype=np.float64)
    for j in range(res.shape[0]):
        if S[j]-K>delS:
            res[j]+=np.floor((S[j]-K)/delS)
            K+=np.floor((S[j]-K)/delS)*delS
        elif S[j]-K<-delS:
            res[j]+=np.ceil((S[j]-K)/delS)
            K+=np.ceil((S[j]-K)/delS)*delS
    return res

S=np.array([1.0,1.05,1.1,1.12,1.09,1.14,1.21,1.6,1.05,1.0,0.95,0.90,0.87,0.77,0.63,0.85,0.91,0.76],dtype=np.float64)

K=1.0
delS=0.1

l=get_result(S,K,delS)

for j in range(S.shape[0]):
    print("%d\t%.2f\t%.0f" % (j,S[j],l[j]))

The get_result function contains a for-loop, however, and is therefore awkwardly slow for larger input vectors S. Can such a function be vectorized in numpy-syntax? Any help would be appreciated.

I was first under an impression that the whole function could be vectorized in just three lines of code, but then noticed that K is changing in the loop. Since calculating K[j+1] requires the knowledge of K[j], there are dependencies between the iterations and the loop cannot be vectorized. Not at least to the best of my knowledge. — DYZ
– DYZ, Commented Dec 21, 2016 at 7:26
If you use Python 2.x replace range (returns list) by xrange (returns generator) for a significant speedup. — DerWeh
– DerWeh, Commented Dec 21, 2016 at 11:11

hpaulj · Accepted Answer · 2016-12-21 08:12:33Z

2

A general pattern when dealing with 2 or more conditions in an array or calculation, is to construct a boolean mask, and take one action for elements where that mask is true, and different where it is false:

res=np.zeros(S.shape[0],dtype=np.float64)
mask - S-K>delS
res[mask] = ...S[mask]
res[~mask] = ...S[~mask]

A variant is to identify the indices, with np.where(mask).

But what appears to complicate your calculation is that test keeps changing. That is, K for j+1 derives from calculation for j.

for j in range(res.shape[0]):
    if S[j]-K>delS:
        res[j]+=np.floor((S[j]-K)/delS)
        K+=np.floor((S[j]-K)/delS)*delS
    elif S[j]-K<-delS:
        res[j]+=np.ceil((S[j]-K)/delS)
        K+=np.ceil((S[j]-K)/delS)*delS

To deal with that kind of iteration we usually try to use np.cumsum or other accumulative ufunc methods.

As a general rule, numpy code is fastest and easiest when the calculation can be applied to all elements of an array (or set of arrays) 'in parallel' - that is, in a way that does not depend on the order of iteration. Then we can delegate the action to fast compiled numpy functions like array addition and multiplication. If the computation is serial in nature (the value for j depending on j-1) this becomes trickier.

So if my cursory reading is right, it isn't the if-statement that is difficult, it's the serial nature of the loop.

=========================

Playing around, I was able to remove the if (there are actually 3 subcase), but it still has to be iterative:

def get_result(S,K,delS):
    S = S/delS
    res=np.zeros(S.shape[0],dtype=np.float64)
    for j in range(res.shape[0]):
        x = S[j] - K/delS
        xa = np.floor(np.abs(x)) * np.sign(x)
        res[j] += xa
        K += xa*delS         
    return res

edited Dec 21, 2016 at 8:12

answered Dec 21, 2016 at 7:34

hpaulj

233k14 gold badges260 silver badges392 bronze badges

Sign up to request clarification or add additional context in comments.

2 Comments

user5507059 Over a year ago

The bottom line is that this type of function cannot be easily vectorized, especially if the dependence on previous values in the loop becomes even more intricate. For reference: I am going to cythonize my for-loop. This makes it fast and keeps the code readable.

hpaulj Over a year ago

Yes this looks like a good candidate for cython, especially if you can avoid the numpy versions of floor, ceiling, abs, etc.

Collectives™ on Stack Overflow

How to vectorize a for loop containing if-statement in numpy?

1 Answer 1

2 Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

2 Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related