I want to execute some processes in parallel and wait until they finish. So I wrote this code:

pool = mp.Pool(5)
for a in table:
    pool.apply(func, args=(some_args,))
pool.close()
pool.join()

Will I get 5 processes executing func in parallel here? Or is apply_async the only option?

2 Answers

The docs are quite clear on this: each call to apply blocks until the result is ready. Use apply_async.
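
A minimal sketch of that suggestion, assuming a hypothetical worker func and a placeholder list of inputs (neither is defined in the question). All tasks are submitted with apply_async first, and only then are the results collected, so up to five tasks can run at the same time:

```python
import multiprocessing as mp


def func(x):
    # Hypothetical stand-in for the question's worker function.
    return x * x


def run_all(table, workers=5):
    # Submit every task without blocking, then wait for all of them.
    with mp.Pool(workers) as pool:
        pending = [pool.apply_async(func, args=(item,)) for item in table]
        # get() blocks until that particular task is done; once this
        # list is built, every submitted task has completed.
        return [result.get() for result in pending]


if __name__ == "__main__":
    print(run_all([1, 2, 3, 4]))
```

Calling get() on each returned object is also a simple way to wait for the whole batch without writing a callback.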

6 Comments

So why do I need a pool? Why do I need multiprocessing at all? With apply_async you have to specify a callback. I want something that executes processes in parallel, waits for them all, and then does something else. In particular, I have a table of tables of tasks. I want to iterate through the table, execute each chunk of processes, wait for that chunk to finish, and then execute the next chunk of processes in parallel, and so on. With callbacks this is extremely complicated...
@mnowotka No, the callback is optional. You can access the return value of func either via the callback or by calling get on the object returned by apply_async. In your example you don't use the return value at all. In that case you could use apply_async without further changes.
No, the callback is also how I find out that all the processes have finished. Even if I don't care about return values, I need to know that the previous chunk has finished before I create the next chunk of processes. Implementing this with callbacks is a real pain; that's why JavaScript has many implementations of promises, where you can say when(task1, task2).then(task3).
@mnowotka You could consider Pool.map instead.
@mnowotka map(...) waits for the result, equivalent to map_async(...).get(). Unlike apply, map is able to initiate parallel work when it calls the function multiple times (for each item in the iterable argument).
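
The chunk-by-chunk workflow described in the comments above could be sketched with Pool.map roughly as follows. The names task and run_chunks and the sample data are hypothetical, chosen only for illustration:

```python
import multiprocessing as mp


def task(x):
    # Hypothetical stand-in for one unit of work.
    return x + 1


def run_chunks(chunks, workers=5):
    results = []
    with mp.Pool(workers) as pool:
        for chunk in chunks:
            # map() runs the items of this chunk in parallel and returns
            # only after all of them are done, so the next chunk cannot
            # start before the current one has finished.
            results.append(pool.map(task, chunk))
    return results


if __name__ == "__main__":
    print(run_chunks([[1, 2, 3], [10, 20]]))
```

Because map() blocks per chunk, no callbacks are needed to sequence the chunks; the loop itself provides the ordering.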

Another solution is to use Pool.imap_unordered().

The following code starts a pool of 5 workers and sends three jobs to it: num=1, num=2, and num=3. imap_unordered yields each result as soon as any worker finishes it, so the loop prints the results in completion order, which is not necessarily the input order.

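import multiprocessing

def calc(num):
    # Double the input; runs in a worker process.
    return num * 2

if __name__ == '__main__':
    pool = multiprocessing.Pool(5)
    # Results arrive in completion order, not input order.
    for output in pool.imap_unordered(calc, [1, 2, 3]):
        print('output:', output)
    pool.close()
    pool.join()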
