
Essentially I'm having to do a lot of matrix-vector products, where the 3x3 matrices are stored in the last two axes of an (n x n x 3 x 3) numpy array and the vectors in the last axis of a corresponding (n x n x 3) array.

Currently I'm looping quite inelegantly over the first two indices and calculating each matrix-vector product separately:

import numpy as np

length = 1000
x = np.random.rand(length, length, 3)
A = np.random.rand(length, length, 3, 3)
result = np.zeros((length, length, 3))

for i in range(length):
    for j in range(length):
        result[i, j, :] = A[i, j, :, :].dot(x[i, j, :])

This, however, is quite slow. Is there a more efficient way to avoid this i,j for loop? I'm still trying to learn the numpy methods. ;)

Also, in my real-life case, the matrices are very sparse (>0.999 sparsity), so bonus points for anyone who can give me some hints on how to incorporate that.

Thank you very much!

  • With that sparsity you might gain some speed (and better memory use) by making (3*n,3*n) scipy sparse matrices with the block format (with dense 3x3 blocks). Commented Nov 9, 2021 at 17:29
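The block format suggested in the comment above is `scipy.sparse`'s BSR format. Since each output block is `A[i,j] @ x[i,j]`, the whole batch is one block-*diagonal* product, so only the nonzero 3x3 blocks need to be stored. A minimal sketch of that idea (the grid size and block density here are made up for illustration):

```python
import numpy as np
from scipy.sparse import bsr_matrix

np.random.seed(0)
n, m = 50, 3

# Hypothetical sparse field: an (n, n) grid of 3x3 blocks, most exactly zero.
A = np.zeros((n, n, m, m))
mask = np.random.rand(n, n) < 0.01           # ~1% of blocks are nonzero
A[mask] = np.random.randn(mask.sum(), m, m)
x = np.random.rand(n, n, m)

# Build a block-diagonal BSR matrix holding only the nonzero 3x3 blocks.
# Block row k contains at most one block, at block column k.
rows = np.flatnonzero(mask)                              # block col == block row
indptr = np.concatenate(([0], np.cumsum(mask.ravel())))  # blocks per block row
A_bsr = bsr_matrix((A[mask], rows, indptr),
                   shape=(n * n * m, n * n * m), blocksize=(m, m))

# One sparse matrix-vector product replaces the whole double loop.
result = (A_bsr @ x.reshape(-1)).reshape(n, n, m)
print(np.allclose(result, np.einsum('ijkl,ijl->ijk', A, x)))  # True
```

With >0.999 block sparsity this stores and multiplies roughly a thousandth of the data the dense batched product touches.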

3 Answers


You could use np.einsum to perform the matrix/vector dot products over both i and j:

>>> np.einsum('ijkl,ijl->ijk', A, x)

>>> np.allclose(np.einsum('ijkl,ijl->ijk', A, x), result)
True
Sign up to request clarification or add additional context in comments.

2 Comments

Thank you for that! I used einsum in another part of the code and somehow didn't think to use it here.
I upvoted, your method is the fastest one proposed

You can use broadcasting over the dot operator (@) by temporarily introducing an extra dimension in x:

result = (A @ x[:,:,:,None]).squeeze(axis=-1)

4 Comments

Thanks for the answer. :) Will this also need additional memory space or is this extra dimension (somewhat) fictional?
@NegativeJacobian It is entirely fictional and free - it simply reinterprets the data as it sits in memory differently.
Sticking with the slice notation, you can do (A @ x[:,:,:,None])[..., 0] ;)
Since my matrices are very sparse, even the fastest suggested way is significantly slower than my "naive" implementation when i restrict j to only cover the nonzero entries (e.g. the "neighbors" j actually connected to i). I guess I won't get around sparse matrix structures, but maybe I should open a separate question on that.
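As the comment thread above notes, the extra axis introduced by `None` (i.e. `np.newaxis`) is free: basic indexing returns a view, not a copy. A quick check (with made-up small sizes) confirms this:

```python
import numpy as np

x = np.random.rand(4, 4, 3)
A = np.random.rand(4, 4, 3, 3)

xcol = x[:, :, :, None]           # shape (4, 4, 3, 1): a view, not a copy
print(np.shares_memory(x, xcol))  # True -> no extra memory allocated

# The batched product, then squeeze the trailing axis back out:
result = (A @ xcol).squeeze(axis=-1)
print(result.shape)               # (4, 4, 3)
```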

There are already some answers to this question, but I have a different method that a friend of mine taught me, and I want to share it.

When you have a matrix-vector (or matrix-matrix) product, supposing A is NxM and x has length M, you can compute the product in numpy by reshaping x, taking the elementwise product and then summing over the last dimension:

n=10
m=3
A = np.random.randn(n,m)
x = np.random.randn(m)

result1 = A.dot(x)
result2 = (A*x.reshape(1,-1)).sum(-1)
print(np.allclose(result1,result2))
#throws True

Using that, you can solve your problem for the larger arrays:

import numpy as np

def method_1(A,x,n,m): #OP
    result=np.zeros((n,n,m))
    for i in range(0,n):
        for j in range(0,n):
            result[i,j,:]=A[i,j,:,:].dot(x[i,j,:])
    return result

def method_2(A,x,n,m): #Me 
    return (A* x.reshape(n,n,1,m)).sum(-1)

def method_3(A,x,n,m): #orlp
    return (A @ x[:,:,:,None]).squeeze(axis=-1)

def method_4(A,x,n,m): #ivan
    return np.einsum('ijkl,ijl->ijk', A, x)

n= 7
m = 3
A = np.random.randn(n,n,m,m)
x = np.random.randn(n,n,m)

r1 = method_1(A,x,n,m)
r2 = method_2(A,x,n,m)
r3 = method_3(A,x,n,m)
r4 = method_4(A,x,n,m)

r12 = np.allclose(r1,r2)
r13 = np.allclose(r1,r3)
r14 = np.allclose(r1,r4)

print(" same result?")
print(f"op-me: {r12}, op-orlp: {r13}, op-ivan: {r14}")

#same result?
#op-me: True, op-orlp: True, op-ivan: True

Finally, I compared the speed of all these methods:

from time import time
import matplotlib.pyplot as plt
nList = [10,100,1000,10000]

times_1 =[]
times_2 =[]
times_3 =[]
times_4 =[]
for n in nList:
    print(n)
    A = np.random.randn(n,n,m,m)
    x = np.random.randn(n,n,m)
    t1 = time()
    r1 = method_1(A,x,n,m)
    t2 = time()
    r2 = method_2(A,x,n,m)
    t3 = time()
    r3 = method_3(A,x,n,m)
    t4 = time()
    r4 = method_4(A,x,n,m)
    t5 = time()

    times_1.append(t2-t1)
    times_2.append(t3-t2)
    times_3.append(t4-t3)
    times_4.append(t5-t4)

#plot times
plt.figure()
plt.loglog(nList,times_1,'o-',label="op")
plt.loglog(nList,times_2,'o-',label="me")
plt.loglog(nList,times_3,'o-',label="orlp")
plt.loglog(nList,times_4,'o-',label="ivan")
plt.legend()

The results show that the method proposed by Ivan (einsum) is the fastest one. [plot: log-log timing comparison of the four methods]

Comments
