I am profiling some genetic algorithm code with some nested loops and from what I see most of the time is spent in two of my functions which involve slicing and adding up numpy arrays. I tried my best to further optimize them but would like to see if others come up with ideas.
Function 1:
The first function is called 2954684 times for a total time spent inside the function of 19 seconds
We basically just create views inside numpy arrays contained in data[0], according to coordinates contained in data[1]
def get_signal(data, options):
#data[0] contains bed, data[1] contains position
#forward = 0, reverse = 1
start = data[1][0] - options.halfwinwidth
end = data[1][0] + options.halfwinwidth
if data[1][1] == 0:
normals_forward = data[0]['normals_forward'][start:end]
normals_reverse = data[0]['normals_reverse'][start:end]
else:
normals_forward = data[0]['normals_reverse'][end - 1:start - 1: -1]
normals_reverse = data[0]['normals_forward'][end - 1:start - 1: -1]
row = {'normals_forward': normals_forward,
'normals_reverse': normals_reverse,
}
return row
Function 2:
Called 857 times for a total time of 13.674 seconds spent inside the function:
signal is a list of numpy arrays of equal length with dtype float, options is just random options
The goal of the function is just to add up the lists of each numpy arrays to a single one, calculate the intersection of the two curves formed by the forward and reverse arrays and return the result
def calculate_signal(signal, options):
profile_normals_forward = np.zeros(options.halfwinwidth * 2, dtype='f')
profile_normals_reverse = np.zeros(options.halfwinwidth * 2, dtype='f')
#here i tried np.sum over axis = 0, its significantly slower than the for loop approach
for b in signal:
profile_normals_forward += b['normals_forward']
profile_normals_reverse += b['normals_reverse']
count = len(signal)
if options.normalize == 1:
#print "Normalizing to max counts"
profile_normals_forward /= max(profile_normals_forward)
profile_normals_reverse /= max(profile_normals_reverse)
elif options.normalize == 2:
#print "Normalizing to number of elements"
profile_normals_forward /= count
profile_normals_reverse /= count
intersection_signal = np.fmin(profile_normals_forward, profile_normals_reverse)
intersection = np.sum(intersection_signal)
results = {"intersection": intersection,
"profile_normals_forward": profile_normals_forward,
"profile_normals_reverse": profile_normals_reverse,
}
return results
As you can see the two are very simple, but account for > 60% of my execution time on a script that can run for hours / days (genetic algorithm optimization), so even minor improvements are welcome :)
data? A Python list ornumpyarray? A list of arrays? What's thedtypeof those arrays? I'm guessingrecarray.