0

There has to be a faster way of doing this.

There is a lot going on here, but it's fairly straightforward to unpack.

Here is the relevant python code (from scipy import *)

for i in arange(len(wav)):
    result[i] = sum(laser_flux * exp(-(wav[i] - laser_wav)**2) )

There are a bunch of arrays.

  • result -- array of length (wav)
  • laser_flux -- array of length (laser)
  • wav -- array of length (wav)
  • laser_wav -- array of length (laser)

Yes, within the exponential, I am squaring (element by element) the difference between the scalar value and the array of laser_wav.

Everything works as expected (including SLOWLY) any help you can give me to eliminate this for-loop would be much appreciated!

2
  • 5
    Tip: include complete, runnable sample code, so people don't have to spend time parsing out what's what, and so everyone has the same sample inputs and outputs to compare. Commented Jul 31, 2009 at 3:20
  • 3
    Not posting as an answer because I doubt the usefulness of my answer, but feel it's worth pointing out in a comment. Does Python have linear algebra libraries? If so, it may be faster to express your squaring/summing as a matrix multiplication. Just a brainstorm-style idea to consider. Commented Jul 31, 2009 at 3:24

4 Answers 4

13

You're going to want to use Numpy arrays (if you're not already) to store your data. Then, you can take advantage of array broadcasting with np.newaxis. For each value in wav, you're going to want to compute a difference between that value and each value in laser_wav. That suggests that you'll want a two-dimensional array, with the two dimensions being the wav dimension and the laser dimension.

In the example below, I'll pick the first index as the laser index and the second index as the wav index. With sample data, this becomes:

import numpy as np

LASER_LEN  = 5
WAV_LEN    = 10
laser_flux = np.arange(LASER_LEN)
wav        = np.arange(WAV_LEN)
laser_wav  = np.array(LASER_LEN)

# Tile wav into LASER_LEN rows and tile laser_wav into WAV_LEN columns
diff    = wav[np.newaxis,:] - laser_wav[:,np.newaxis]
exp_arg = -diff ** 2
sum_arg = laser_flux[:,np.newaxis] * np.exp(exp_arg)

# Now, the resulting array sum_arg should be of size (LASER_LEN,WAV_LEN)
# Since your original sum was along each element of laser_flux/laser_wav, 
# you'll need to sum along the first axis.
result = np.sum(sum_arg, axis=0)

Of course, you could just condense this down into a single statement:

result = np.sum(laser_flux[:,np.newaxis] * 
                np.exp(-(wav[np.newaxis,:]-laser_wav[:,np.newaxis])**2),axis=0)

Edit:

As noted in the comments to the question, you can take advantage of the "sum of multiplications" inherent in the definition of linear-algebra style multiplications. This then becomes:

result = np.dot(laser_flux, 
    np.exp(-(wav[np.newaxis,:] - laser_wav[:,np.newaxis])**2))
Sign up to request clarification or add additional context in comments.

2 Comments

It's also worth noting that using numpy will be much faster than any pure Python alternative. numpy executes its calculations in C.
That is definitely true - I noticed an enormous speedup when I made this type of change in one of my applications.
2

I'm new to Python, so this may not the most optimal in Python, but I'd use the same technique for Perl, Scheme, etc.

def func(x):
    delta = x - laser_wav
    return sum(laser_flux * exp(-delta * delta))
result = map(func, wav)

6 Comments

wouldn't the function call to sq(x) add some time? it might be more to write but if you're really worried about speed it seems like actually writing out the x*x would be better than calling the function once.
I believe it's generally considered more pythonic thse days to use a list comprehension instead of map(). But I can't really see either solution providing a significant (or even measurable?) speed increase..
@Casey: I've removed sq, for that reason. @John: You're probably right, but I'm really unfamiliar with list comprehensions, because I'm much more used to languages with map (Perl, Scheme, etc.). But thanks for the tip!
result = [func(x) for x in wav]
func() does not return anything. So result will be a list of Nones.
|
1

If raw performance is an issue, you might benefit from rewriting to take advantage of multiple cores, if you have them.

from multiprocessing import Pool
p = Pool(5) # about the number of cores you have

def f(i):
    delta = wav[i] - laser_wav
    return sum(laser_flux * exp(-delta*delta) )

result = p.map(f, arange(len(wav)) )

Comments

0

For one thing, it seems to be slightly quicker to multiply a variable by itself than to use the ** power operator:

~$ python -m timeit -n 100000 -v "x = 4.1; x * x"
raw times: 0.0904 0.0513 0.0493
100000 loops, best of 3: 0.493 usec per loop
~$ python -m timeit -n 100000 -v "x = 4.1; x**2"
raw times: 0.101 0.147 0.118
100000 loops, best of 3: 1.01 usec per loop

1 Comment

If you're doing optimizations at this level, you're really close to where you should be writing a native module to handle the math. There might be higher-level optimizations to be made, though (waiting on sample data before I look)...

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.