
I am attempting to parallelise some code in Python. Running in serial, my code takes around 24 hours; however, there is a for loop in which each iteration is independent of the previous one, so this is an ideal situation for parallelisation. A simple example of what I am trying to achieve is as follows:

import scipy as sci
from multiprocessing import Pool

def mycode(args):
  for x in range(0,2000):
    y = sci.fft(data[x,:],axis=1)
    output[x,:]=y
  return output

if __name__=="__main__":
  pool=Pool(processes = 8)
  output= pool.map(mycode(args),2000)  

However, looking at top, I can see that although Python spawns 9 processes, only one of them is actually using any CPU power or memory; all the others sit at 0%. What is the correct way to use Pool with a for loop?

  • What is the reconstruction function? Does it return a callable? Why is the second argument to pool.map a number (2000) and not an iterable? Why is there a mycode function in your example if it is never called? Commented Nov 25, 2013 at 15:46
  • With the iterable argument as 2000, it will only use 1 process. Commented Nov 25, 2013 at 16:01

1 Answer


As long as the data variable is defined at module level, so that the worker processes inherit it, this should work:

import scipy as sci
from multiprocessing import Pool

def mycode(x):
    y = sci.fft(data[x,:],axis=1)
    return y

if __name__=="__main__":
    pool = Pool(processes=8)
    output = pool.map(mycode, range(2000))
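For reference, a self-contained sketch of the same pattern, using a seeded NumPy array as a stand-in for the question's `data` (the array shape here is an assumption, and `numpy.fft.fft` stands in for the `sci.fft` call):

```python
import numpy as np
from multiprocessing import Pool

# Stand-in for the question's `data`: 2000 rows of 64 samples each.
# Seeded so every process sees identical values, and defined at module
# level so fork-started workers (the Linux default) inherit it.
rng = np.random.default_rng(0)
data = rng.random((2000, 64))

def mycode(x):
    # FFT of one row; each call is independent, so Pool.map can
    # distribute the 2000 calls across the worker processes.
    return np.fft.fft(data[x, :])

if __name__ == "__main__":
    with Pool(processes=8) as pool:
        # map expects a callable and an iterable, not a number:
        output = np.array(pool.map(mycode, range(2000)))
    print(output.shape)  # (2000, 64)
```

The key difference from the question's attempt is that `pool.map` receives the function itself (not the result of calling it) plus an iterable of arguments, so each worker gets a share of the 2000 indices.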

