
I have the following function, which takes two 1D numpy arrays q_i and q_j, does some calculation (including taking the norm of their difference) and returns a numpy array:

import numpy as np
import numpy.linalg as lin

def coulomb(q_i, q_j, c1=0.8, c2=0.2, k=0.5):
    """
    Parameters
    ----------
    q_i : numpy.ndarray
    q_j : numpy.ndarray
    c1 : float
    c2 : float
    k : float

    Returns
    -------
        numpy.ndarray
    """
    
    q_ij = q_i - q_j
    q_ij_norm = lin.norm(q_ij)
    f_q_i = (k * c1 * c2 / q_ij_norm ** 3) * q_ij
    return f_q_i

Now I have a bunch of these q arrays stored in another numpy array t = [q1, q2, q3, ..., qn], and I want to evaluate the function coulomb for all unique pairs of q_i and q_j inside t (i.e. for (q1, q2), (q1, q3), ..., (q1, qn), (q2, q3), ..., (q2, qn), ... (q_n-1, qn)).

Is there a way to vectorize this calculation (and I mean really vectorize it to boost performance, because np.vectorize is only a for-loop under the hood)?

My current solution is a nested for-loop, which is far from optimal performance-wise:

for i in range(len(t)):
    for j in range(i + 1, len(t)):
        f = coulomb(t[i], t[j])
        ...

2 Answers


Here are 3 possible solutions. The last one is a little chaotic, but it uses vectorization to calculate n q's against one at a time. It is also the fastest.

from itertools import combinations
import numpy as np
import numpy.linalg as lin

def coulomb(q_i, q_j, c1=0.8, c2=0.2, k=0.5):
    """
    Parameters
    ----------
    q_i : numpy.ndarray
    q_j : numpy.ndarray
    c1 : float
    c2 : float
    k : float

    Returns
    -------
        numpy.ndarray
    """
    
    q_ij = q_i - q_j
    q_ij_norm = lin.norm(q_ij)
    f_q_i = (k * c1 * c2 / q_ij_norm ** 3) * q_ij
    return f_q_i    

def coulomb2(q_i, q_j, c1=0.8, c2=0.2, k=0.5):
    """
    Parameters
    ----------
    q_i : numpy.ndarray
    q_j : numpy.ndarray
    c1 : float
    c2 : float
    k : float

    Returns
    -------
        numpy.ndarray
    """
    
    q_ij = q_i - q_j
    q_ij_norm = lin.norm(q_ij,axis=1).reshape(-1,1)
    f_q_i = (k * c1 * c2 / q_ij_norm ** 3) * q_ij
    return f_q_i    



q = np.random.randn(500,10)
from time import time

# Approach 1: explicit nested loops over index pairs.
t1 = time()
v = []
for i in range(q.shape[0]):
    for j in range(i + 1, q.shape[0]):
        v.append([coulomb(q[i], q[j])])
t2 = time()

# Approach 2: itertools.combinations to generate the unique pairs.
combs = combinations(range(len(q)), 2)
vv = []
for i, j in combs:
    vv.append([coulomb(q[i], q[j])])
t3 = time()

# Approach 3: one vectorized call per row against all following rows.
vvv = []
for i in range(q.shape[0]):
    vvv += list(coulomb2(q[i], q[i+1:]))
t4 = time()

print(t2 - t1)
print(t3 - t2)
print(t4 - t3)

#0.9133327007293701
#1.0843684673309326
#0.04461050033569336


2 Comments

Thank you, coulomb2 is exactly what I was looking for! The performance advantage of coulomb2 over coulomb grows with the size of q; for 3000 elements in q, it was about 100 times faster. Is there maybe a way to also get rid of the outer for loop still needed for coulomb2?
I'm not sure... my first guess is that one might exist, but to me it seems impossible to get rid of this loop.
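For what it's worth, the outer loop can be dropped entirely with np.triu_indices, which produces the index arrays for all unique (i, j) pairs with i < j up front; coulomb2 then handles everything in a single vectorized call. A sketch (reusing coulomb2 from the answer above):

```python
import numpy as np
import numpy.linalg as lin

def coulomb2(q_i, q_j, c1=0.8, c2=0.2, k=0.5):
    # Same vectorized version as in the answer: q_i and q_j are 2D,
    # one row per pair.
    q_ij = q_i - q_j
    q_ij_norm = lin.norm(q_ij, axis=1).reshape(-1, 1)
    return (k * c1 * c2 / q_ij_norm ** 3) * q_ij

q = np.random.randn(500, 10)

# Index arrays for all unique pairs (i < j), in the same order as the
# nested loop: (0,1), (0,2), ..., (0,499), (1,2), ...
i_idx, j_idx = np.triu_indices(q.shape[0], k=1)

# q[i_idx] and q[j_idx] are both (n_pairs, dim), so one call suffices.
f = coulomb2(q[i_idx], q[j_idx])
print(f.shape)  # (124750, 10): one row per unique pair
```

The trade-off is memory: all 124750 pair rows are materialized at once instead of one batch per outer iteration.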

There is a trade-off between memory and speed when you want to vectorize these types of NumPy problems.

Looping over a NumPy array is slow, but it does not require much memory. If you want to vectorize, you have to duplicate data, thereby using excess memory, and pass it to NumPy functions.
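That memory cost is easy to quantify: the broadcast intermediate holds every (i, j) difference at once, so it grows as n² times the dimension. A quick illustration (float64 assumed, as NumPy's default):

```python
import numpy as np

n, dim = 500, 3
t = np.random.rand(n, dim)

# The broadcast intermediate holds all n*n pairwise differences at once,
# including i == j and both orderings of each pair.
q_ij = t[np.newaxis, :, :] - t[:, np.newaxis, :]
print(q_ij.shape)   # (500, 500, 3)
print(q_ij.nbytes)  # 6000000 bytes: n * n * dim * 8

# The same intermediate for n = 5000 would need:
print(5000 * 5000 * dim * 8)  # 600000000 bytes, i.e. ~0.6 GB
```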

One way of vectorizing your function is

import numpy as np
import numpy.linalg as lin

def coulomb(q_i, q_j, c1=0.8, c2=0.2, k=0.5):
    """
    Parameters
    ----------
    q_i : numpy.ndarray
    q_j : numpy.ndarray
    c1 : float
    c2 : float
    k : float

    Returns
    -------
        numpy.ndarray
    """
    
    q_ij = q_i[np.newaxis,:, :] - q_j[:,np.newaxis, :] #broadcasting, therefore creating more data
    q_ij_norm = lin.norm(q_ij, axis=2) # this can be zero when qi = qj
    f_q_i = (k * c1 * c2 / (q_ij_norm ** 3)[:,:,np.newaxis]) * q_ij #broadcasting again
    return np.nan_to_num(f_q_i) # this returns a 3D array with shape (i,j,dim_of_q). Here Nan's will be replaced with 0's

t = np.random.rand(500,3)
f_q = coulomb(t, t)

f_q[1, 2, :]  # the pairwise term for q1 and q2 (equals coulomb(t[2], t[1]))

1 Comment

Thank you, but this is faster than my solution only when the size of q is small (~<1000). For larger q, it becomes about as fast as my initial solution with the nested for loops. I think it's because it is calculating a lot of unnecessary values along the way as well. Is there maybe a way to avoid calculating all the unwanted values?
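One way to avoid the unwanted values is to build only the unique i < j pairs before computing anything, e.g. with np.triu_indices for the differences and scipy.spatial.distance.pdist for the norms (pdist returns exactly those pairs, in the same order). A sketch, assuming SciPy is available; coulomb_pairs is an illustrative name, not from the answers above:

```python
import numpy as np
from scipy.spatial.distance import pdist

def coulomb_pairs(t, c1=0.8, c2=0.2, k=0.5):
    # Differences for the unique i < j pairs only: no duplicated work
    # and no zero-norm diagonal entries.
    i_idx, j_idx = np.triu_indices(len(t), k=1)
    q_ij = t[i_idx] - t[j_idx]
    # pdist returns the condensed distance matrix in the same i < j order.
    q_ij_norm = pdist(t).reshape(-1, 1)
    return (k * c1 * c2 / q_ij_norm ** 3) * q_ij

t = np.random.rand(500, 3)
f = coulomb_pairs(t)
print(f.shape)  # (124750, 3): one row per unique pair
```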
