I'm trying to start a variable number of threads to compute the results of functions for one of my automated trading modules. I have about 14 functions, all of which are computationally expensive. I've been calculating each function sequentially, but that takes around 3 minutes to complete. Because my platform is high frequency, I need to cut that computation time down to 1 minute or less.
I've read up on multiprocessing and multithreading, but I can't find a solution that fits my need.
What I'm trying to do is define "n" number of threads to use, then divide my list of functions into "n" groups, then compute each group of functions in a separate thread. Essentially:
import numpy as np

# func1..func4, their output names, and their argument tuples are placeholders
functionList = [func1, func2, func3, func4]
outputList = [func1out, func2out, func3out, func4out]
argsList = [func1args, func2args, func3args, func4args]
# number of threads
n = 3
functionSplit = np.array_split(np.array(functionList), n)
outputSplit = np.array_split(np.array(outputList), n)
argSplit = np.array_split(np.array(argsList), n)
Now I'd like to start "n" separate threads, each processing the functions according to the split lists. Then I'd like to name the output of each function according to the outputList and create a master dict of the outputs from each function. I will then loop through the output dict and create a dataframe with column ID numbers according to the information in each column (I already have this part worked out; I just need the multithreading).
Is there any way to do something like this? I've been looking into creating a subclass of the threading.Thread class and passing the functions, output names, and arguments into the run() method, but I don't know how to name and output the results of the functions from each thread! Nor do I know how to call functions in a list according to their corresponding arguments!
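For what it's worth, here is a minimal sketch of the "call each function with its own arguments and name the result" part, using concurrent.futures.ThreadPoolExecutor from the standard library instead of a threading.Thread subclass. The helper run_named_tasks and the two stand-in functions (moving_sum, spread) are hypothetical names made up for illustration; the real functions and arguments would go in their place. Note that in standard CPython, threads generally do not run pure-Python CPU-bound code in parallel, so this mainly shows the bookkeeping rather than a guaranteed speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def run_named_tasks(functions, output_names, args_list, n_workers):
    """Call functions[i](*args_list[i]) on a pool of n_workers threads.

    Returns a dict mapping output_names[i] to the i-th function's result.
    """
    with ThreadPoolExecutor(max_workers=n_workers) as executor:
        # submit() schedules func(*args) and returns a Future immediately.
        futures = {
            name: executor.submit(func, *args)
            for func, name, args in zip(functions, output_names, args_list)
        }
        # result() waits for completion and re-raises any worker exception.
        return {name: future.result() for name, future in futures.items()}


# Hypothetical stand-ins for the real, expensive functions.
def moving_sum(values, window):
    return [sum(values[i:i + window]) for i in range(len(values) - window + 1)]


def spread(a, b):
    return a - b


functionList = [moving_sum, spread]
outputList = ["func1out", "func2out"]        # names stored as strings
argsList = [([1, 2, 3, 4], 2), (10.5, 10.0)]  # one argument tuple per function

outputs = run_named_tasks(functionList, outputList, argsList, n_workers=2)
print(outputs)
```

With a pool, the work is handed to whichever worker is free, so splitting the lists into "n" groups by hand with np.array_split is not needed in this sketch.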
The reason I'm doing this is to find the number of threads that gives the best balance between computational efficiency and run time. Like I said, this will be integrated into a high frequency trading platform I'm developing, where time is my major constraint!
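One way to look for that balance is simply to time the same workload at several worker counts and compare. The sketch below is only an illustration: time_worker_counts and fake_work are hypothetical names, and fake_work sleeps instead of computing, so the timings it produces say nothing about real CPU-bound functions (sleeping releases the interpreter lock, real pure-Python computation does not).

```python
import time
from concurrent.futures import ThreadPoolExecutor


def time_worker_counts(func, inputs, worker_counts):
    """Return {n_workers: elapsed_seconds} for running func over inputs."""
    timings = {}
    for n in worker_counts:
        start = time.perf_counter()
        with ThreadPoolExecutor(max_workers=n) as executor:
            # list() forces every result to be computed before the timer stops.
            list(executor.map(func, inputs))
        timings[n] = time.perf_counter() - start
    return timings


def fake_work(delay):
    # Hypothetical stand-in: sleeps for `delay` seconds instead of computing.
    time.sleep(delay)
    return delay


timings = time_worker_counts(fake_work, [0.05] * 4, worker_counts=[1, 2, 4])
best = min(timings, key=timings.get)  # worker count with the shortest run
print(timings, best)
```

The same loop could be pointed at the real functions to see where adding workers stops paying off.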
Any ideas?
process module from threading? threading is the wrong module. Use multiprocessing. Pool.map is what you want.
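A rough sketch of the multiprocessing route suggested above, under some assumptions: it uses Pool.apply_async rather than Pool.map, because each task here is a different function with its own arguments, and the helper name run_in_processes is made up for illustration. The standard-library callables (math.factorial, pow, sum, max) are only stand-ins; real functions would need to be defined at module top level so they can be pickled and sent to the worker processes.

```python
import math
from multiprocessing import Pool


def run_in_processes(tasks, n_workers):
    """Run (name, func, args) tasks in a pool of n_workers processes.

    Returns a dict mapping each output name to that function's result.
    """
    with Pool(processes=n_workers) as pool:
        # Submit every task first so the workers can run them concurrently.
        pending = {name: pool.apply_async(func, args) for name, func, args in tasks}
        # get() blocks until the result is ready and re-raises worker errors.
        return {name: handle.get() for name, handle in pending.items()}


if __name__ == "__main__":
    # Stand-ins for the real functions (hypothetical examples).
    tasks = [
        ("func1out", math.factorial, (10,)),
        ("func2out", pow, (2, 16)),
        ("func3out", sum, ([1, 2, 3],)),
        ("func4out", max, (4, 9, 2)),
    ]
    outputs = run_in_processes(tasks, n_workers=3)
    print(outputs)
```

The `if __name__ == "__main__":` guard matters on platforms where worker processes are started by re-importing the main module; without it, each worker would try to create its own pool.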