Speed up numpy calculation in nested loop

Question

Here is a snippet of my code.

def rbf(r):
    rsq = r ** 2.0
    if rsq == 0.0:
        return 0.0
    val = rsq * np.log(rsq)
    if np.isnan(val):
        return 0.0
    return val

for i in range(449):
    for j, row in enumerate(p):
        res[j] += w[i] * rbf(r=np.linalg.norm(p[i] - row))
return res

Here p is an array with shape (4448^2, 3) and res is an array with shape (4448^2,). The in-place addition takes too much time. I tried another approach, as follows.

def rbf(r):
    rsq = r ** 2.0
    if rsq == 0.0:
        return 0.0
    val = rsq * np.log(rsq)
    if np.isnan(val):
        return 0.0
    return val

def func(elem, row):
    summand = sum([w[j]
                  * rbf(r=np.linalg.norm(p[j] - row))
                  for j in range(449)])
    return summand + elem

dst_res = np.array([func(elem=elem, row=row)
                    for elem, row in zip(res, p)])
return dst_res

But I still saw no improvement. Any advice for improving performance?

What is w? Can you express this as a pure function with inputs (w, p?) and outputs? Why 449? What does that have to do with 4448^2? Please provide a proper MCVE that can be run in the terminal with no additional typing.
– Mateen Ulhaq, Commented Aug 23, 2020 at 1:20

Duda · Accepted Answer · 2020-08-27 21:58:53Z

There are several ways to speed up your program.

First, I would restructure the computational function: avoid ifs, or move them to the outermost loop, and avoid computationally expensive operations such as the power operator (**).

def rbf(r):
    # The zero/NaN test is deferred to the outer loop; compare timings for both variants.
    # if (r == 0) or np.isinf(r) or np.isnan(r):
    #     return 0.0
    rsq = r * r
    return rsq * np.log(rsq)
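
As a rough sketch of the same idea applied to a whole array at once, the kernel can be written so that the logarithm is only evaluated where the squared distance is positive. The name `rbf_vec` is hypothetical (it is not from the question), and the sketch assumes that zero or NaN distances should contribute 0.0, as in the question's `rbf`.

```python
import numpy as np

def rbf_vec(r):
    """Elementwise r^2 * log(r^2); entries with r == 0 or NaN map to 0.0."""
    r = np.asarray(r, dtype=float)
    rsq = r * r                      # cheaper than r ** 2.0
    out = np.zeros_like(rsq)
    mask = rsq > 0.0                 # False for 0 and for NaN
    out[mask] = rsq[mask] * np.log(rsq[mask])
    return out
```

Because the mask skips zero entries, no log(0) is evaluated, so no separate NaN cleanup pass is needed afterwards.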

Second, additions are comparatively fast, so there is not much to gain by improving them. You could instead vectorize your data to speed up the multiplication:

tmp = np.zeros(449)  # temporary vector for the kernel values
for j, row in enumerate(p):  # loop order swapped: rows outside, centers inside
    for i in range(449):
        tmp[i] = rbf(r=np.linalg.norm(p[i] - row))  # fill vector
    # rbf(0) evaluates 0 * log(0) = NaN, so the NaN check is done here, once per row;
    # measure both scenarios
    tmp[np.isnan(tmp)] = 0.0

    res[j] += np.dot(w, tmp)  # accumulate, matching the question's res[j] += ...
return res

You can reduce everything further by applying the rbf steps to the whole vector at once:

tmp = np.zeros(449)  # temporary vector
for j, row in enumerate(p):  # loop order swapped: rows outside, centers inside
    for i in range(449):
        tmp[i] = np.linalg.norm(p[i] - row)  # store the plain distance r, not rbf(r)
    tmp *= tmp  # rbf part 1: r^2
    tmp *= np.log(tmp)  # rbf part 2: r^2 * log(r^2)
    tmp[np.isnan(tmp)] = 0.0  # NaN test (r == 0 gives 0 * -inf = NaN)

    res[j] += np.dot(w, tmp)  # weighted sum, accumulated into res
return res
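
Going one step further, the inner loop can be dropped entirely with broadcasting. The following is only a sketch under assumptions: `accumulate_rbf`, `n_centers` and `chunk` are hypothetical names, `w` is assumed to hold at least 449 weights, and the first 449 rows of `p` are assumed to be the centers, as in the question's loop. Rows are processed in chunks because a full (4448^2, 449, 3) array of differences would not fit in memory.

```python
import numpy as np

def accumulate_rbf(p, w, res, n_centers=449, chunk=10_000):
    """Return res[j] + sum_i w[i] * rbf(|p[i] - p[j]|) for i < n_centers."""
    p = np.asarray(p, dtype=float)
    centers = p[:n_centers]                              # (n_centers, 3)
    weights = np.asarray(w, dtype=float)[:n_centers]
    out = np.array(res, dtype=float)                     # copy; input stays untouched
    for start in range(0, len(p), chunk):
        block = p[start:start + chunk]                   # (m, 3)
        diff = block[:, None, :] - centers[None, :, :]   # (m, n_centers, 3)
        rsq = np.einsum("ijk,ijk->ij", diff, diff)       # squared distances, no sqrt
        val = np.zeros_like(rsq)
        mask = rsq > 0.0                                 # skip log(0)
        val[mask] = rsq[mask] * np.log(rsq[mask])
        out[start:start + chunk] += val @ weights        # weighted sum per row
    return out
```

With the default chunk size, the largest temporary array has 10,000 × 449 × 3 float64 values (roughly 108 MB). Lower `chunk` if memory is tight, and compare the result against the loop version on a small input before trusting it.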

These code snippets were not tested. Further improvements are possible. Always check that the results are still correct. Please share your progress.
