How to use python multiprocessing Pool.map within loop

Question

I am running a simulation using Runge-Kutta. At every time step, two FFTs of two independent variables are needed, and these can be computed in parallel. I implemented the code like this:

from multiprocessing import Pool
import numpy as np

pool = Pool(processes=2)    # Only 2 FFTs are computed in parallel
                            # in every time step, therefore 2 processes

def Splitter(args):
    '''I have to pass 2 arguments'''
    return makeSomething(*args)

def makeSomething(a,b):
    '''dummy function instead of the one with the FFT'''
    return a*b

def RungeK():
    # ...
    # a lot of code which creates the vectors A and B and calculates
    # one Runge-Kutta step for them
    # ...

    n = 20                         # Just something for the example
    A = np.arange(50000)
    B = np.ones_like(A)

    for i in xrange(n):                  # loop over the time steps
        A *= np.mean(B)*B - A
        B *= np.sqrt(A)
        results = pool.map(Splitter,[(A,3),(B,2)])
        A = results[0]
        B = results[1]

    print np.mean(A)                                 # Some output
    print np.max(B)

if __name__== '__main__':
    RungeK()

Unfortunately, Python spawns an unlimited number of processes once it reaches the loop. Before that, only two processes seem to be running. My memory also fills up. Adding

pool.close()
pool.join()

after the loop does not solve my problem, and putting it inside the loop makes no sense to me. I hope you can help.

In this case I don't think using multiprocessing will gain you much, given the overhead of transferring numpy arrays between processes. You could try using shared multiprocessing.Array objects, though.
– Roland Smith, Commented Mar 22, 2014 at 19:53
Sounds interesting, I will try to figure out how this can work tomorrow.
– Peter, Commented Mar 22, 2014 at 23:28

Roland Smith · Accepted Answer · 2014-03-22 20:02:06Z

Move the creation of the pool into the RungeK function:

def RungeK():
    # ...
    # a lot of code which creates the vectors A and B and calculates
    # one Runge-Kutta step for them
    # ...

    pool = Pool(processes=2)
    n = 20                         # Just something for the example
    A = np.arange(50000)
    B = np.ones_like(A)

    for i in xrange(n):  # loop over the time steps
        A *= np.mean(B)*B - A
        B *= np.sqrt(A)
        results = pool.map(Splitter, [(A, 3), (B, 2)])
        A = results[0]
        B = results[1]
    pool.close()
    print np.mean(A)  # Some output
    print np.max(B)

Alternatively, put it in the main block.

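This is probably a side effect of how multiprocessing works. For example, on MS Windows each worker process imports the main module when it starts, so the main module must be importable without side effects (such as creating new processes). A pool created at module level would otherwise be created again in every worker.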
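As an illustration of the advice above, here is a minimal sketch written for Python 3 (the code in the question is Python 2). It is not the original simulation: the update step is reduced to the dummy multiplication, the arrays are floats, and the names make_something, splitter, runge_k and n_steps are adapted or invented for this example. It shows the pool being created once inside the function, reused for every time step, and started only from the main guard.

```python
from multiprocessing import Pool

import numpy as np


def make_something(a, b):
    """Dummy stand-in for the FFT-based function from the question."""
    return a * b


def splitter(args):
    """Unpack the (a, b) tuple, since Pool.map passes a single argument."""
    return make_something(*args)


def runge_k(n_steps=20):
    # Assumption for this sketch: float arrays and a simplified update step.
    A = np.linspace(0.1, 1.0, 50000)
    B = np.ones_like(A)

    # The pool is created here, not at import time, and reused in every step.
    with Pool(processes=2) as pool:
        for _ in range(n_steps):
            A, B = pool.map(splitter, [(A, 3), (B, 2)])
    return A, B


if __name__ == "__main__":
    # Only the parent process runs this block; workers that re-import
    # the module skip it and therefore never create a pool of their own.
    A, B = runge_k()
    print(np.mean(A), np.max(B))
```

Both arrays are still pickled and sent to the workers in every step, so this addresses the runaway process creation but not the transfer overhead.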

Peter Over a year ago

Thanks a lot, it worked. Unfortunately it does not speed up my code. It seems you're right and the transfer of the numpy arrays takes some time.
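The shared multiprocessing.Array idea from the comment on the question could look roughly like the sketch below (Python 3). This is an assumption about how one might apply it, not something tested against the real simulation: the helper names init_worker, as_numpy, scale_in_place and run, the scaling factors, and the in-place multiplication standing in for the FFT are all made up for illustration. The arrays are handed to the workers once, when the pool starts, and each task then only sends a short name and a number.

```python
from multiprocessing import Array, Pool

import numpy as np

# Filled in by init_worker in each worker process (hypothetical helper state).
_shared = {}


def init_worker(shared_a, shared_b):
    """Store the shared arrays once per worker, at pool start-up."""
    _shared["A"] = shared_a
    _shared["B"] = shared_b


def as_numpy(shared_arr):
    """View a shared 'd' (double) Array as a numpy array without copying."""
    return np.frombuffer(shared_arr.get_obj(), dtype=np.float64)


def scale_in_place(task):
    """Dummy work: multiply one shared array in place by a factor."""
    name, factor = task
    view = as_numpy(_shared[name])
    view *= factor
    return name


def run(n_steps=20, size=50000):
    shared_a = Array("d", size)
    shared_b = Array("d", size)
    as_numpy(shared_a)[:] = np.linspace(0.1, 1.0, size)
    as_numpy(shared_b)[:] = 1.0

    with Pool(processes=2, initializer=init_worker,
              initargs=(shared_a, shared_b)) as pool:
        for _ in range(n_steps):
            # Each task touches a different array, so the workers
            # do not write to the same memory at the same time.
            pool.map(scale_in_place, [("A", 1.01), ("B", 0.99)])

    return as_numpy(shared_a).copy(), as_numpy(shared_b).copy()


if __name__ == "__main__":
    A, B = run()
    print(np.mean(A), np.max(B))
```

Whether this is actually faster than a plain single-process loop depends on how expensive the per-step work is compared with the cost of dispatching two tasks, so it is worth timing both variants.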
