Here is a snippet of my code.

def rbf(r):
    rsq = r ** 2.0
    if rsq == 0.0:
        return 0.0
    val = rsq * np.log(rsq)
    if np.isnan(val):
        return 0.0
    return val

for i in range(449):
    for j, row in enumerate(p):
        res[j] += w[i] * rbf(r=np.linalg.norm(p[i] - row))
return res

Here p is an array with shape (4448^2, 3) and res is an array with shape (4448^2,). This in-place addition takes too much time. I tried another approach, as follows.

def rbf(r):
    rsq = r ** 2.0
    if rsq == 0.0:
        return 0.0
    val = rsq * np.log(rsq)
    if np.isnan(val):
        return 0.0
    return val

def func(elem, row):
    summand = sum([w[j]
                  * rbf(r=np.linalg.norm(p[j] - row))
                  for j in range(449)])
    return summand + elem

dst_res = np.array([func(elem=elem, row=row)
                    for elem, row in zip(res, p)])
return dst_res

But I still did not see any improvement. Any advice for improving performance?

  • What is w? Can you express this as a pure function with inputs (w, p?) and outputs? Why 449? What does that have to do with 4448^2? Please provide a proper MCVE that can be run in the terminal with no additional typing. Commented Aug 23, 2020 at 1:20

1 Answer

There are several ways to speed up your program.

First, I would restructure the computational function: avoid ifs, or move them out of the innermost loop, and avoid computationally intensive operations such as power (**).

def rbf(r):
    #if (r==0) or (np.isinf(r)) or (np.isnan(r)):
        #return 0.0
        # NaN-Test is forwarded to the outer loop; compare computational times!
    rsq = r * r
    return rsq * np.log(rsq)
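The same idea can be taken one step further by letting the function accept a whole array of squared distances, so the zero check becomes a mask instead of a per-call if. This is only a minimal sketch: the name rbf_sq and the choice to pass squared distances (which skips the square root inside np.linalg.norm) are assumptions for illustration, not part of the question's code.

```python
import numpy as np

def rbf_sq(rsq):
    """Return rsq * log(rsq) elementwise, with 0.0 where rsq == 0.

    Hypothetical helper: takes *squared* distances as a NumPy array.
    """
    rsq = np.asarray(rsq, dtype=float)
    out = np.zeros_like(rsq)
    mask = rsq > 0.0                      # skip log(0) entirely
    out[mask] = rsq[mask] * np.log(rsq[mask])
    return out

vals = rbf_sq(np.array([0.0, 1.0, 4.0]))
```

Masking before the logarithm also avoids the RuntimeWarning that log(0) would otherwise raise.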

Second, additions are comparatively fast, so there is not much to gain by improving them. You could instead try to vectorize your data to speed up the multiplication:

tmp = np.zeros(449)  # store values in a temporary vector
for j, row in enumerate(p):  # change the order of the loops
    for i in range(449):
        tmp[i] = rbf(r=np.linalg.norm(p[i] - row))  # fill vector
    # the NaN check is now done once per row: measure both scenarios
    tmp[np.isnan(tmp)] = 0.0

    # accumulate into res as in the question; assumes w has 449 entries
    res[j] += np.sum(w * tmp)
return res

You can reduce everything further by moving the rbf computation itself out of the inner loop:

tmp = np.zeros(449)  # store values in a temporary vector
for j, row in enumerate(p):  # change the order of the loops
    for i in range(449):
        tmp[i] = np.linalg.norm(p[i] - row)  # fill vector with plain distances r
    tmp *= tmp  # rbf part 1: r -> r^2
    tmp *= np.log(tmp)  # rbf part 2: r^2 -> r^2 * log(r^2)
    tmp[np.isnan(tmp)] = 0.0  # NaN check (0 * log(0) gives NaN)

    # accumulate into res as in the question; assumes w has 449 entries
    res[j] += np.sum(w * tmp)
return res
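The remaining Python loops can also be removed with broadcasting. The following is a minimal sketch, assuming that the 449 centers are the first 449 rows of p (as the indexing p[i] in the question suggests) and that w has at least 449 entries. The names rbf_sum_chunked, n_centers and chunk are hypothetical. Because p has 4448^2 rows, the full distance matrix would not fit in memory, so the rows are processed in blocks; with chunk=10_000 the largest temporary array is roughly 100 MB.

```python
import numpy as np

def rbf_sum_chunked(p, w, res, n_centers=449, chunk=10_000):
    """Return res[j] + sum_i w[i] * rsq_ij * log(rsq_ij),
    where rsq_ij = ||p[i] - p[j]||^2 and i runs over the first n_centers rows.
    """
    p = np.asarray(p, dtype=float)
    centers = p[:n_centers]                           # (n_centers, 3)
    weights = np.asarray(w, dtype=float)[:n_centers]
    out = np.array(res, dtype=float)                  # work on a copy of res
    for start in range(0, len(p), chunk):
        block = p[start:start + chunk]                # (m, 3)
        diff = block[:, None, :] - centers[None, :, :]  # (m, n_centers, 3)
        rsq = np.einsum('ijk,ijk->ij', diff, diff)    # squared distances
        safe = np.where(rsq > 0.0, rsq, 1.0)          # log(1) = 0 where rsq == 0
        phi = rsq * np.log(safe)
        out[start:start + chunk] += phi @ weights     # weighted sum per row
    return out
```

This returns a new array rather than modifying res in place. The chunk size trades memory for fewer loop iterations, so it is worth timing a few values on the real data and comparing the output against the loop version on a small subset.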

These code snippets were not tested. Further improvements are possible. Always check that the results are still correct. Please share your progress.
