
I am attempting to parallelise some code in Python. Running in serial, my code takes around 24 hours; however, there is a for loop in which each iteration is independent of the previous one, so this is an ideal situation for parallelisation. A simple example of what I am trying to achieve is as follows:

import scipy as sci
from multiprocessing import Pool

def mycode(args):
  for x in range(0,2000):
    y = sci.fft(data[x,:],axis=1)
    output[x,:]=y
  return output

if __name__=="__main__":
  pool=Pool(processes = 8)
  output= pool.map(mycode(args),2000)  

However, looking at top, I can see that although Python spawns 9 processes, only one of them is actually using any CPU power or memory; all the others sit at 0%. What is the correct way to use Pool with a for loop?

  • What is the reconstruction function? Does it return a callable? Why is the second argument to pool.map a number (2000) and not an iterable? Why is there a mycode function in your example if it is never called? Commented Nov 25, 2013 at 15:46
  • With the iterable argument as 2000, it will only use 1 process. Commented Nov 25, 2013 at 16:01

1 Answer


As long as the data variable is defined at module level, so that the worker processes inherit it, this should work:

import scipy as sci
from multiprocessing import Pool

def mycode(x):
    y = sci.fft(data[x,:],axis=1)
    return y

if __name__=="__main__":
    pool = Pool(processes=8)
    output = pool.map(mycode, range(2000))
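For reference, a self-contained sketch of the same pattern, using a seeded NumPy array as a stand-in for the question's `data` (the array shape here is an assumption, and `numpy.fft.fft` stands in for the `sci.fft` call):

```python
import numpy as np
from multiprocessing import Pool

# Stand-in for the question's `data`: 2000 rows of 64 samples each.
# Seeded so every process sees identical values, and defined at module
# level so fork-started workers (the Linux default) inherit it.
rng = np.random.default_rng(0)
data = rng.random((2000, 64))

def mycode(x):
    # FFT of one row; each call is independent, so Pool.map can
    # distribute the 2000 calls across the worker processes.
    return np.fft.fft(data[x, :])

if __name__ == "__main__":
    with Pool(processes=8) as pool:
        # map expects a callable and an iterable, not a number:
        output = np.array(pool.map(mycode, range(2000)))
    print(output.shape)  # (2000, 64)
```

The key difference from the question's attempt is that `pool.map` receives the function itself (not the result of calling it) plus an iterable of arguments, so each worker gets a share of the 2000 indices.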

