How to speed up this Python code?

Question

I've got the following tiny Python method that is by far the performance hotspot (according to my profiler, >95% of execution time is spent here) in a much larger program:

def topScore(self, seq):
    ret = -1e9999
    logProbs = self.logProbs  # save indirection
    l = len(logProbs)
    for i in xrange(len(seq) - l + 1):
        score = 0.0
        for j in xrange(l):
            score += logProbs[j][seq[j + i]]
        ret = max(ret, score)

    return ret

The code is being run in the Jython implementation of Python, not CPython, if that matters. seq is a DNA sequence string, on the order of 1,000 elements. logProbs is a list of dictionaries, one for each position. The goal is to find the maximum score of any length l (on the order of 10-20 elements) subsequence of seq.
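
To make the data layout concrete, here is a minimal standalone sketch of the same computation. The three-position model and its values are made up purely for illustration and are not taken from the real program; `top_score` and `log_probs` are hypothetical names, and `range` is used instead of `xrange` so the snippet runs on both Python 2 and 3.

```python
# Hypothetical sample data: a 3-position model over the alphabet ACGT.
# The numbers are invented for illustration only.
log_probs = [
    {"A": -0.5, "C": -2.0, "G": -2.0, "T": -2.0},
    {"A": -2.0, "C": -0.5, "G": -2.0, "T": -2.0},
    {"A": -2.0, "C": -2.0, "G": -0.5, "T": -2.0},
]


def top_score(log_probs, seq):
    """Standalone copy of the method above: best score over all windows."""
    best = float("-inf")
    width = len(log_probs)
    for i in range(len(seq) - width + 1):
        score = 0.0
        for j in range(width):
            # Position j of the model scores the base at offset i + j.
            score += log_probs[j][seq[j + i]]
        best = max(best, score)
    return best


print(top_score(log_probs, "TTACGTT"))  # -1.5, from the window "ACG"
```

Every other window of "TTACGTT" scores -6.0 under this toy model, so the maximum comes from the single window that matches all three positions.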

I realize all this looping is inefficient due to interpretation overhead and would be a heck of a lot faster in a statically compiled/JIT'd language. However, I'm not willing to switch languages. First, I need a JVM language for the libraries I'm using, and this kind of constrains my choices. Secondly, I don't want to translate this code wholesale into a lower-level JVM language. However, I'm willing to rewrite this hotspot in something else if necessary, though I have no clue how to interface it or what the overhead would be.

In addition to the single-threaded slowness of this method, I also can't get the program to scale much past 4 CPUs in terms of parallelization. Given that it spends almost all its time in the 10-line hotspot I've posted, I can't figure out what the bottleneck could be here.

I can't quite get my head around the data structure you are using. Could you post a shortened sample of seq and logProbs?
– Katriel, Commented Nov 17, 2010 at 22:32
My first thought was numpy, so maybe something on this page might be of use: stackoverflow.com/questions/316410/…
– Russell Borogove, Commented Nov 17, 2010 at 22:41
My second thought is everting the iteration such that you go over seq only one time, but that probably means that logProbs and score become more complex, and may not actually reduce work done.
– Russell Borogove, Commented Nov 17, 2010 at 22:43
@Russell: No numpy in Jython, though I think you should be able to access Java's numerics.
– Katriel, Commented Nov 17, 2010 at 22:53
@Fred Larson: Sorry, I meant make each item in the logProbs list a list instead of a dictionary. There's only a small number of possible values that seq[index] can have, I believe. This would entail logically mapping each possible value to an index, but that's probably faster than hashing each value from the sequence to look up its value in a dictionary.
– martineau, Commented Nov 18, 2010 at 1:21

Rohan Monga · Accepted Answer · 2010-11-18 05:53:13Z

2

If topScore is called repeatedly for the same seq, you could memoize its value.

E.g. http://code.activestate.com/recipes/52201/
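
As a rough sketch of what memoizing could look like here (not the linked recipe itself), the version below keeps a per-instance dictionary keyed by seq. The `Motif` class, the `_cache` and `computed` attributes, and the `_top_score_uncached` helper are hypothetical names introduced for illustration. It assumes that seq is a hashable string and that logProbs does not change after construction.

```python
class Motif(object):
    """Hypothetical stand-in for the class that owns topScore."""

    def __init__(self, log_probs):
        self.logProbs = log_probs
        self._cache = {}    # seq -> best score (assumes logProbs never changes)
        self.computed = 0   # counts real computations, for demonstration only

    def _top_score_uncached(self, seq):
        self.computed += 1
        log_probs = self.logProbs
        width = len(log_probs)
        best = float("-inf")
        for i in range(len(seq) - width + 1):
            score = 0.0
            for j in range(width):
                score += log_probs[j][seq[j + i]]
            if score > best:
                best = score
        return best

    def topScore(self, seq):
        # Strings are hashable, so seq can be used directly as the cache key.
        try:
            return self._cache[seq]
        except KeyError:
            result = self._cache[seq] = self._top_score_uncached(seq)
            return result


motif = Motif([{"A": -0.5, "C": -2.0}, {"A": -2.0, "C": -0.5}])
motif.topScore("CACA")
motif.topScore("CACA")   # second call is served from the cache
print(motif.computed)    # 1
```

The cache in this sketch is unbounded, so it only pays off when the same sequences really do recur and the number of distinct sequences fits in memory.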


2 Comments

dsimcha Over a year ago

While I was hoping to get something a little more insightful out of this post, I'll accept this because it's what I ended up doing.

Rohan Monga Over a year ago

That recipe is basically showing how to write a decorator that captures the return values and maps it to inputs. I thought I'd give you code to do that, cause I always like to use other people's code instead of writing my own

John La Rooy · Answer · 2010-11-17 23:01:51Z

2

It is slow because it is O(N*l): each of the N window positions is rescored from scratch over all l positions of logProbs.

The maximum subsequence algorithm may help you improve this.


1 Comment

dsimcha Over a year ago

This problem is a little bit different than maximum subsequence, just enough that the proposed solution doesn't quite work.

Wangnick · Answer · answered Nov 17, 2010 at 23:19 (edited by cwallenpoole, 2010-11-18 00:52:45Z)

1

What about precomputing xrange(l) outside the for i loop?


mouad · Answer · 2010-11-18 12:28:46Z

1

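I'm not sure about this, but maybe swapping the two loops can help speed up your algorithm:

import collections

ret = -1e9999
logProbs = self.logProbs  # save indirection
l = len(logProbs)

# scores[i] accumulates the score of the window starting at offset i
scores = collections.defaultdict(float)

for j in xrange(l):
    prob = logProbs[j]  # look up the dictionary for position j only once
    for i in xrange(len(seq) - l + 1):
        scores[i] += prob[seq[j + i]]

# max() of an empty sequence raises ValueError, so guard against len(seq) < l
if scores:
    ret = max(ret, max(scores.values()))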

answered Nov 17, 2010 at 23:17, edited Nov 18, 2010 at 12:28

Thomas K · Answer · 2010-11-17 22:48:05Z

0

Nothing jumps out as being slow. I might rewrite the inner loop like this:

score = sum(logProbs[j][seq[j+i]] for j in xrange(l))

or even:

seqmatch = zip(seq[i:i+l], logProbs)
score = sum(posscores[base] for base, posscores in seqmatch)

but I don't know that either would save much time.

It might be marginally quicker to store DNA bases as integers 0-3, and look up the scores from a tuple instead of a dictionary. There'll be a performance hit on translating letters to numbers, but that only has to be done once.
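
A minimal sketch of that idea, assuming the alphabet is exactly A, C, G, T (any other character, such as N, would raise a KeyError here). `BASE_INDEX`, `encode`, `to_rows` and `top_score_encoded` are hypothetical names, and `range` is used so the snippet runs on both Python 2 and 3. Whether it is actually faster under Jython would need to be measured.

```python
BASE_INDEX = {"A": 0, "C": 1, "G": 2, "T": 3}  # assumed alphabet
BASES = "ACGT"


def encode(seq):
    """Translate a DNA string into a list of integers 0-3 (done once per sequence)."""
    return [BASE_INDEX[base] for base in seq]


def to_rows(log_probs):
    """Convert the list of dictionaries into a list of 4-tuples (done once per model)."""
    return [tuple(scores[base] for base in BASES) for scores in log_probs]


def top_score_encoded(rows, codes):
    """Same sliding-window maximum, using tuple indexing instead of dict lookups."""
    best = float("-inf")
    width = len(rows)
    for i in range(len(codes) - width + 1):
        score = 0.0
        for j in range(width):
            score += rows[j][codes[i + j]]
        if score > best:
            best = score
    return best


# Hypothetical two-position model with invented values.
log_probs = [
    {"A": -0.5, "C": -2.0, "G": -2.0, "T": -2.0},
    {"A": -2.0, "C": -0.5, "G": -2.0, "T": -2.0},
]
rows = to_rows(log_probs)
codes = encode("GGACGG")
print(top_score_encoded(rows, codes))  # -1.0, from the window "AC"
```

The conversions are kept outside the scoring function on purpose, so their cost is paid once rather than on every call.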


1 Comment

martineau Over a year ago

Might want to use math.fsum() if accuracy matters.
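
A small illustration of the difference, using ten copies of 0.1 rather than real log-probabilities: plain `sum()` accumulates rounding error, while `math.fsum()` tracks the lost low-order bits.

```python
import math

values = [0.1] * 10

print(sum(values) == 1.0)        # False: rounding error accumulates
print(math.fsum(values) == 1.0)  # True: fsum compensates for the lost bits
```

For windows of only 10 to 20 terms the error of plain summation is tiny, so this matters mainly if scores are compared for exact equality or are extremely close together.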

kiyo · Answer · 2010-11-17 23:25:23Z

Definitely use numpy and store logProbs as a 2D array instead of a list of dictionaries. Also store seq as a 1D array of (short) integers as suggested above. This will help only if you don't have to do these conversions every time you call the function (doing the conversions inside the function won't save you much). You can then eliminate the second loop:

import numpy as np
...
print np.shape(self.logProbs) # (20, 4)
print np.shape(seq) # (1000,)
...
def topScore(self, seq):
    ret = -1e9999
    logProbs = self.logProbs  # save indirection
    l = len(logProbs)
    rows = np.arange(l)  # positions 0..l-1 of logProbs
    for i in xrange(len(seq) - l + 1):
        # Pair position j with base seq[i+j]. Indexing with logProbs[:, seq[i:i+l]]
        # would instead select an l-by-l block and sum every combination.
        score = np.sum(logProbs[rows, seq[i:i+l]])
        ret = max(ret, score)

    return ret

What you do after that depends on which of these 2 data elements changes the most often:

If logProbs generally stays the same and you want to run many DNA sequences through it, then consider stacking your DNA sequences as a 2D array. numpy can loop through the 2D array very quickly, so if you have 200 DNA sequences to process, it will take only a little longer than a single one.

Finally, if you really need a speedup, use scipy.weave. This is a very easy way to write a few lines of fast C to accelerate your loops. However, I recommend scipy >0.8.

John Machin · Answer · 2010-11-17 23:32:57Z

0

You can try hoisting more than just self.logProbs outside the loops:

def topScore(self, seq):
    ret = -1e9999
    logProbs = self.logProbs  # save indirection
    l = len(logProbs)
    lrange = range(l)
    for i in xrange(len(seq) - l + 1):
        score = 0.0
        for j in lrange:
            score += logProbs[j][seq[j + i]]
        if score > ret: ret = score # avoid lookup and function call

    return ret


Russell Borogove · Answer · 2010-11-17 23:35:26Z

0

I doubt it will make a significant difference, but you could try changing:

for j in xrange(l):
    score += logProbs[j][seq[j + i]]

to

for j, lP in enumerate(logProbs):
    score += lP[seq[j + i]]

or even hoisting that enumeration outside the seq loop.
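
A sketch of the hoisted variant, written as a standalone function so it can be run on its own. `top_score_hoisted` and `indexed` are hypothetical names, and `range` replaces `xrange` so it runs on both Python 2 and 3. The enumeration is materialized as a list once, then reused for every window.

```python
def top_score_hoisted(log_probs, seq):
    """Sliding-window maximum with the (index, dict) pairs built only once."""
    indexed = list(enumerate(log_probs))  # hoisted out of the seq loop
    best = float("-inf")
    for i in range(len(seq) - len(log_probs) + 1):
        score = 0.0
        for j, lp in indexed:
            score += lp[seq[j + i]]
        if score > best:
            best = score
    return best
```

This removes one list index and one generator setup per window, which is why I would expect only a modest gain, if any.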

