
I have the following function, which takes two 1D numpy arrays q_i and q_j, does some calculation (including taking the norm of their difference) and returns a numpy array:

import numpy as np
import numpy.linalg as lin

def coulomb(q_i, q_j, c1=0.8, c2=0.2, k=0.5):
    """
    Parameters
    ----------
    q_i : numpy.ndarray
    q_j : numpy.ndarray
    c1 : float
    c2 : float
    k : float

    Returns
    -------
        numpy.ndarray
    """
    
    q_ij = q_i - q_j
    q_ij_norm = lin.norm(q_ij)
    f_q_i = (k * c1 * c2 / q_ij_norm ** 3) * q_ij
    return f_q_i

Now I have a bunch of these q arrays stored in another numpy array t = [q1, q2, q3, ..., qn], and I want to evaluate the function coulomb for all unique pairs of q_i and q_j inside t (i.e. for (q1, q2), (q1, q3), ..., (q1, qn), (q2, q3), ..., (q2, qn), ... (q_n-1, qn)).

Is there a way to vectorize this calculation (and I mean really vectorize it to boost performance, because np.vectorize is only a for-loop under the hood)?

My current solution is a nested for-loop, which is far from optimal performance-wise:

for i in range(len(t)):
    for j in range(i + 1, len(t)):
        f = coulomb(t[i], t[j])
        ...

2 Answers


Here are 3 possible solutions. The last one is a little chaotic, but it uses vectorization to calculate n q's against one at a time. It is also the fastest.

from itertools import combinations
import numpy as np
import numpy.linalg as lin

def coulomb(q_i, q_j, c1=0.8, c2=0.2, k=0.5):
    """
    Parameters
    ----------
    q_i : numpy.ndarray
    q_j : numpy.ndarray
    c1 : float
    c2 : float
    k : float

    Returns
    -------
        numpy.ndarray
    """
    
    q_ij = q_i - q_j
    q_ij_norm = lin.norm(q_ij)
    f_q_i = (k * c1 * c2 / q_ij_norm ** 3) * q_ij
    return f_q_i    

def coulomb2(q_i, q_j, c1=0.8, c2=0.2, k=0.5):
    """
    Parameters
    ----------
    q_i : numpy.ndarray
    q_j : numpy.ndarray
    c1 : float
    c2 : float
    k : float

    Returns
    -------
        numpy.ndarray
    """
    
    q_ij = q_i - q_j
    q_ij_norm = lin.norm(q_ij,axis=1).reshape(-1,1)
    f_q_i = (k * c1 * c2 / q_ij_norm ** 3) * q_ij
    return f_q_i    



q = np.random.randn(500,10)
from time import time

# Approach 1: explicit nested loops over index pairs.
t1 = time()
v = []
for i in range(q.shape[0]):
    for j in range(i + 1, q.shape[0]):
        v.append([coulomb(q[i], q[j])])
t2 = time()

# Approach 2: itertools.combinations to generate the unique pairs.
combs = combinations(range(len(q)), 2)
vv = []
for i, j in combs:
    vv.append([coulomb(q[i], q[j])])
t3 = time()

# Approach 3: one vectorized call per row against all following rows.
vvv = []
for i in range(q.shape[0]):
    vvv += list(coulomb2(q[i], q[i+1:]))
t4 = time()

print(t2 - t1)
print(t3 - t2)
print(t4 - t3)

#0.9133327007293701
#1.0843684673309326
#0.04461050033569336


2 Comments

Thank you, coulomb2 is exactly what I was looking for! The performance advantage of coulomb2 over coulomb grows with the size of q; for 3000 elements in q, it was about 100 times faster. Is there maybe a way to also get rid of the outer for loop still needed for coulomb2?
I'm not sure... my first guess is that one might exist, but to me it seems impossible to get rid of this loop.
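For what it's worth, the outer loop can be dropped entirely with np.triu_indices, which produces the index arrays for all unique (i, j) pairs with i < j up front; coulomb2 then handles everything in a single vectorized call. A sketch (reusing coulomb2 from the answer above):

```python
import numpy as np
import numpy.linalg as lin

def coulomb2(q_i, q_j, c1=0.8, c2=0.2, k=0.5):
    # Same vectorized version as in the answer: q_i and q_j are 2D,
    # one row per pair.
    q_ij = q_i - q_j
    q_ij_norm = lin.norm(q_ij, axis=1).reshape(-1, 1)
    return (k * c1 * c2 / q_ij_norm ** 3) * q_ij

q = np.random.randn(500, 10)

# Index arrays for all unique pairs (i < j), in the same order as the
# nested loop: (0,1), (0,2), ..., (0,499), (1,2), ...
i_idx, j_idx = np.triu_indices(q.shape[0], k=1)

# q[i_idx] and q[j_idx] are both (n_pairs, dim), so one call suffices.
f = coulomb2(q[i_idx], q[j_idx])
print(f.shape)  # (124750, 10): one row per unique pair
```

The trade-off is memory: all 124750 pair rows are materialized at once instead of one batch per outer iteration.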

There is a trade-off between memory and speed when you want to vectorize these types of NumPy problems.

Looping over a NumPy array is slow, but it does not require much memory. If you want to vectorize, you have to duplicate data, thereby using excess memory, and pass it to NumPy functions.
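That memory cost is easy to quantify: the broadcast intermediate holds every (i, j) difference at once, so it grows as n² times the dimension. A quick illustration (float64 assumed, as NumPy's default):

```python
import numpy as np

n, dim = 500, 3
t = np.random.rand(n, dim)

# The broadcast intermediate holds all n*n pairwise differences at once,
# including i == j and both orderings of each pair.
q_ij = t[np.newaxis, :, :] - t[:, np.newaxis, :]
print(q_ij.shape)   # (500, 500, 3)
print(q_ij.nbytes)  # 6000000 bytes: n * n * dim * 8

# The same intermediate for n = 5000 would need:
print(5000 * 5000 * dim * 8)  # 600000000 bytes, i.e. ~0.6 GB
```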

One way of vectorizing your function is

import numpy as np
import numpy.linalg as lin

def coulomb(q_i, q_j, c1=0.8, c2=0.2, k=0.5):
    """
    Parameters
    ----------
    q_i : numpy.ndarray
    q_j : numpy.ndarray
    c1 : float
    c2 : float
    k : float

    Returns
    -------
        numpy.ndarray
    """
    
    q_ij = q_i[np.newaxis,:, :] - q_j[:,np.newaxis, :] #broadcasting, therefore creating more data
    q_ij_norm = lin.norm(q_ij, axis=2) # this can be zero when qi = qj
    f_q_i = (k * c1 * c2 / (q_ij_norm ** 3)[:,:,np.newaxis]) * q_ij #broadcasting again
    return np.nan_to_num(f_q_i) # this returns a 3D array with shape (i,j,dim_of_q). Here Nan's will be replaced with 0's

t = np.random.rand(500,3)
f_q = coulomb(t, t)

f_q[1, 2, :]  # the pairwise term for q1 and q2 (equals coulomb(t[2], t[1]))

1 Comment

Thank you, but this is faster than my solution only when the size of q is small (~<1000). For larger q, it becomes about as fast as my initial solution with the nested for loops. I think it's because it is calculating a lot of unnecessary values along the way as well. Is there maybe a way to avoid calculating all the unwanted values?
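One way to avoid the unwanted values is to build only the unique i < j pairs before computing anything, e.g. with np.triu_indices for the differences and scipy.spatial.distance.pdist for the norms (pdist returns exactly those pairs, in the same order). A sketch, assuming SciPy is available; coulomb_pairs is an illustrative name, not from the answers above:

```python
import numpy as np
from scipy.spatial.distance import pdist

def coulomb_pairs(t, c1=0.8, c2=0.2, k=0.5):
    # Differences for the unique i < j pairs only: no duplicated work
    # and no zero-norm diagonal entries.
    i_idx, j_idx = np.triu_indices(len(t), k=1)
    q_ij = t[i_idx] - t[j_idx]
    # pdist returns the condensed distance matrix in the same i < j order.
    q_ij_norm = pdist(t).reshape(-1, 1)
    return (k * c1 * c2 / q_ij_norm ** 3) * q_ij

t = np.random.rand(500, 3)
f = coulomb_pairs(t)
print(f.shape)  # (124750, 3): one row per unique pair
```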
