Optimizing Many Matrix Operations in Python / Numpy

Question

While writing some numerical analysis code, I have hit a bottleneck in a function that makes many NumPy calls. I am not sure how to approach further performance optimization.

Problem:

The function computes an error measure as follows:

Code:

import numpy as np

def foo(B_Mat, A_Mat):
    Temp = np.absolute(B_Mat)
    Temp /= np.amax(Temp)  # normalize |B_Mat| so its maximum is 1
    return np.sqrt(np.sum(np.absolute(A_Mat - Temp*Temp))) / B_Mat.shape[0]

What would be the best way to squeeze some extra performance out of the code? Would my best course of action be performing the majority of the operations in a single for loop with Cython to cut down on the temporary arrays?

Can you also add the part where you call the function?

– Kasravnd, Commented Jul 11, 2016 at 16:34

Divakar · Accepted Answer · 2016-07-11 17:38:26Z

Some of the operations in this function can be offloaded to the numexpr module, which is efficient for element-wise arithmetic on large arrays. Here, the squaring, absolute value, and summation can all be evaluated by numexpr. A numexpr-based replacement for the last step of the original code (with Temp already normalized) would be -

import numexpr as ne

out = np.sqrt(ne.evaluate('sum(abs(A_Mat - Temp**2))'))/B_Mat.shape[0]

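A further performance boost could be achieved by folding the normalization step into the numexpr expression. Because M = max(abs(B_Mat)) is a positive scalar, sum(abs(A_Mat - (Temp/M)**2)) equals sum(abs(A_Mat*M**2 - Temp**2)) / M**2, so the division by M can be applied once to the final scalar instead of to every element of Temp. The entire function modified to use numexpr would be -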

def numexpr_app1(B_Mat, A_Mat):
    Temp = np.absolute(B_Mat)
    M = np.amax(Temp)
    return np.sqrt(ne.evaluate('sum(abs(A_Mat*M**2-Temp**2))'))/(M*B_Mat.shape[0])
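
The rescaling used in numexpr_app1 can be checked without numexpr. The sketch below is only an illustration: the function names error_normalize_first and error_rescale_after, the small array sizes, and the fixed seed are assumptions made here, not part of the original answer. It assumes real floating-point arrays of equal shape and a B_Mat that is not all zeros (otherwise M is 0 and both versions divide by zero).

```python
import numpy as np

def error_normalize_first(B_Mat, A_Mat):
    # Same arithmetic as foo: normalize |B_Mat| by its maximum, then square.
    Temp = np.absolute(B_Mat)
    Temp /= np.amax(Temp)
    return np.sqrt(np.sum(np.absolute(A_Mat - Temp * Temp))) / B_Mat.shape[0]

def error_rescale_after(B_Mat, A_Mat):
    # Same arithmetic as numexpr_app1, written with plain NumPy:
    # skip the element-wise division and divide the final scalar by M.
    Temp = np.absolute(B_Mat)
    M = np.amax(Temp)
    total = np.sum(np.absolute(A_Mat * M**2 - Temp**2))
    return np.sqrt(total) / (M * B_Mat.shape[0])

# Small illustrative inputs (hypothetical sizes, fixed seed).
rng = np.random.default_rng(0)
A_small = rng.standard_normal((40, 50))
B_small = rng.standard_normal((40, 50))

first = error_normalize_first(B_small, A_small)
after = error_rescale_after(B_small, A_small)
print(np.isclose(first, after))
```

The two results agree only up to floating-point rounding, which is why the comparison uses np.isclose rather than ==.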

Runtime test -

In [198]: # Random arrays
     ...: A_Mat = np.random.randn(4000,5000)
     ...: B_Mat = np.random.randn(4000,5000)
     ...: 

In [199]: np.allclose(foo(B_Mat, A_Mat),numexpr_app1(B_Mat, A_Mat))
Out[199]: True

In [200]: %timeit foo(B_Mat, A_Mat)
1 loops, best of 3: 891 ms per loop

In [201]: %timeit numexpr_app1(B_Mat, A_Mat)
1 loops, best of 3: 400 ms per loop
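
If numexpr is not available, the temporaries the question asks about can also be reduced in plain NumPy by reusing one scratch buffer through the out= argument of the ufuncs. This is a different technique from the numexpr approach above, shown only as a sketch: foo_inplace is a hypothetical name, the inputs are assumed to be real floating-point arrays of the same shape, and no speedup is claimed, so it should be timed on the real data before being adopted.

```python
import numpy as np

def foo(B_Mat, A_Mat):
    # Reference implementation, copied from the question.
    Temp = np.absolute(B_Mat)
    Temp /= np.amax(Temp)
    return np.sqrt(np.sum(np.absolute(A_Mat - Temp * Temp))) / B_Mat.shape[0]

def foo_inplace(B_Mat, A_Mat):
    # Hypothetical variant: one full-size buffer is allocated and then
    # reused for every element-wise step. Inputs are left unmodified.
    buf = np.absolute(B_Mat)          # |B_Mat|, the only full-size allocation
    M = buf.max()
    np.multiply(buf, buf, out=buf)    # |B_Mat|**2
    buf /= M * M                      # (|B_Mat| / M)**2
    np.subtract(A_Mat, buf, out=buf)  # A_Mat - (|B_Mat| / M)**2
    np.absolute(buf, out=buf)
    return np.sqrt(buf.sum()) / B_Mat.shape[0]

# Small illustrative inputs (hypothetical sizes, fixed seed).
rng = np.random.default_rng(1)
A_demo = rng.standard_normal((40, 50))
B_demo = rng.standard_normal((40, 50))
print(np.allclose(foo(B_demo, A_demo), foo_inplace(B_demo, A_demo)))
```

The original foo creates several full-size temporaries (Temp*Temp, the difference, and its absolute value); the variant above keeps a single one at the cost of less readable code.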
