Python: parallel execution of a function which has a sequential loop inside

Question

I am reproducing some simple 10-arm bandit experiments from Sutton and Barto's book Reinforcement Learning: An Introduction. Some of these require significant computation time, so I tried to take advantage of my multicore CPU.

Here is the function which I need to run 2000 times. It has 1000 sequential steps which incrementally improve the reward:

import numpy as np

def foo(eps): # need an (unused) argument to use pool.map()
    # initialising
    # the true values of the actions
    q = np.random.normal(0, 1, size=10)
    # the estimated values
    q_est = np.zeros(10)
    # the counter of how many times each of the 10 actions was chosen
    n = np.zeros(10)

    rewards = []
    for i in range(1000):
        # choose an action based on its estimated value
        a = np.argmax(q_est)
        # get the normally distributed reward 
        rewards.append(np.random.normal(q[a], 1)) 
        # increment the chosen action counter
        n[a] += 1 
        # update the estimated value of the action
        q_est[a] += (rewards[-1] - q_est[a]) / n[a] 
    return rewards

I execute this function 2000 times to get a (2000, 1000) array:

reward = np.array([foo(0) for _ in range(2000)])

Then I plot the mean reward across 2000 experiments:

import matplotlib.pyplot as plt
plt.plot(np.arange(1000), reward.mean(axis=0))

sequential plot

which matches the expected result (it looks the same as in the book). But when I try to execute it in parallel, I get a much greater standard deviation of the average reward:

import multiprocessing as mp
with mp.Pool(mp.cpu_count()) as pool:
    reward_p = np.array(pool.map(foo, [0]*2000))
plt.plot(np.arange(1000), reward_p.mean(axis=0))

parallel plot

I suppose this is due to the parallelization of the loop inside foo. As I reduce the number of cores allocated to the task, the reward plot approaches the expected shape.

Is there a way to take advantage of multiprocessing here while still getting correct results?

UPD: I ran the same code on Windows 10, and there the sequential and parallel results turned out to be the same! What might be the reason?

Ubuntu 20.04, Python 3.8.5, jupyter

Windows 10, Python 3.7.3, jupyter

@Marcin Hmm... I've just executed it on a two-core machine and got different results.
– Boldyshev, Commented Nov 1, 2020 at 15:22
I'm on Ubuntu 20.04, Python 3.8.5 via Jupyter. I'll try it on Windows right now.
– Boldyshev, Commented Nov 1, 2020 at 15:32
Wow, Windows shows the same graph after parallel execution as after sequential! I wonder what the reasons are...
– Boldyshev, Commented Nov 1, 2020 at 15:40

Marcin · Accepted Answer · 2020-11-01 15:44:42Z

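As we found out, the behavior differs between Windows and Ubuntu. This is probably because multiprocessing uses a different default start method on each. With fork, every worker inherits a copy of the parent's memory, including the state of NumPy's global random number generator, so the workers can draw the same random sequences and the 2000 runs are no longer independent. With spawn, every worker starts a fresh interpreter with its own freshly seeded generator. From the multiprocessing documentation: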

spawn The parent process starts a fresh python interpreter process. The child process will only inherit those resources necessary to run the process object's run() method. In particular, unnecessary file descriptors and handles from the parent process will not be inherited. Starting a process using this method is rather slow compared to using fork or forkserver.

Available on Unix and Windows. The default on Windows and macOS.

fork The parent process uses os.fork() to fork the Python interpreter. The child process, when it begins, is effectively identical to the parent process. All resources of the parent are inherited by the child process. Note that safely forking a multithreaded process is problematic.

Available on Unix only. The default on Unix.
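To make the effect of fork on random numbers more concrete, here is a minimal single-process sketch. It does not fork anything: it only imitates a forked child by restoring a saved copy of NumPy's global random state before drawing. The helper name simulated_forked_worker is hypothetical and exists only for this illustration.

```python
import numpy as np

# Snapshot of the global RNG state, standing in for the parent's
# state at the moment the workers would be forked.
parent_state = np.random.get_state()

def simulated_forked_worker(n_draws):
    # Hypothetical helper: mimics a forked child, which starts with
    # an exact copy of the parent's global RNG state.
    np.random.set_state(parent_state)
    return np.random.normal(0, 1, size=n_draws)

worker_a = simulated_forked_worker(5)
worker_b = simulated_forked_worker(5)

# Both "workers" produce the same numbers, so their runs are not independent.
print(np.array_equal(worker_a, worker_b))  # True
```

If several workers replay the same draws like this, the average over 2000 runs effectively covers far fewer independent experiments, which would be consistent with the noisier plot in the question.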

Try adding this line to your code:

mp.set_start_method('spawn')
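As an alternative that should not depend on the start method, each run can be given its own random generator. This is a different technique from the one above, sketched under the assumption that NumPy 1.17 or newer is available (for np.random.default_rng and np.random.SeedSequence). The names run_bandit and run_parallel and their parameters are made up for this example; the loop body follows foo from the question.

```python
import multiprocessing as mp
import numpy as np

def run_bandit(seed_seq, n_steps=1000, n_arms=10):
    # Each run owns an independent generator built from its own seed,
    # so nothing depends on the inherited global RNG state.
    rng = np.random.default_rng(seed_seq)
    q = rng.normal(0, 1, size=n_arms)   # true action values
    q_est = np.zeros(n_arms)            # estimated action values
    n = np.zeros(n_arms)                # how often each action was chosen
    rewards = np.empty(n_steps)
    for i in range(n_steps):
        a = np.argmax(q_est)                        # greedy action
        rewards[i] = rng.normal(q[a], 1)            # noisy reward
        n[a] += 1
        q_est[a] += (rewards[i] - q_est[a]) / n[a]  # incremental mean
    return rewards

def run_parallel(n_runs=2000, base_seed=0, n_workers=None, start_method=None):
    # One statistically independent child seed per run.
    seeds = np.random.SeedSequence(base_seed).spawn(n_runs)
    # start_method=None keeps the platform default; "spawn" or "fork"
    # can be passed without changing the global setting.
    ctx = mp.get_context(start_method)
    with ctx.Pool(n_workers) as pool:
        return np.array(pool.map(run_bandit, seeds))
```

In a script, this would be called as reward_p = run_parallel() under an if __name__ == "__main__": guard, which the spawn start method needs. Using mp.get_context instead of mp.set_start_method also avoids the RuntimeError raised when set_start_method is called a second time in the same session.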

2 Comments

Boldyshev Over a year ago

That worked! Thank you so much, Marcin, I've been struggling with this for two days! Pity I can't upvote your answer (need 15 reputation). Got a ~7-fold speed advantage.

Marcin Over a year ago

No prob, I had a problem with forking a long time ago, so I kinda knew what to look for, hahah
