Clarification regarding python Pool.map function used for python parallelism

Question

I have a couple of questions regarding the functioning of the following code fragment.

from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    pool = Pool(processes=10)             # start 10 worker processes
    result = pool.apply_async(f, [10])    # evaluate "f(10)" asynchronously
    print(result.get(timeout=1))          # wait at most 1 second for the result
    print(pool.map(f, range(10)))         # prints "[0, 1, 4,..., 81]"

In the line pool = Pool(processes=10), does it even make a difference if I'm running on a 4-processor (quad-core) architecture and instantiate more than 4 worker processes, since only up to 4 processes can execute at any point in time?
In the pool.map(f, range(10)) call, if I instantiate 10 worker processes and have maybe 50 mappers, does Python take care of assigning mappers to processes as they complete execution, or am I supposed to figure out how many mappers are created and instantiate that many processes in the line pool = Pool(processes=number_of_mappers)?

This is my first attempt at parallelizing anything and I am thoroughly confused, so any help would be much appreciated.

Thanks in advance!

After you have made good design choices to make it run efficiently on a single computer, you may realize that some problems are just too big for a single computer. Adding processes enables you to throw hardware at a particular problem.
– Back2Basics, Commented Oct 30, 2013 at 19:49

Tim Peters · Accepted Answer · 2013-10-30 19:41:47Z

If you create more worker processes than you have available CPUs, that's fine, but the processes will compete with each other for cycles. That is, you'll waste more cycles, in the sense that cycles devoted to switching among processes do nothing to get you closer to finishing. For CPU-bound tasks, it's just wasteful. For I/O-bound tasks, though, it may be just what you want, since in that case processes will spend lots of their time idle, waiting for blocking I/O to complete.
The map functions automatically slice up their iterable argument and send pieces of it to all worker processes. I really don't know what you mean by mappers, though. How many mappers do you think you created in your example? 10? 1? Something else? In what you wrote, pool.map() blocks until all work is completed.
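
To make the second point concrete, here is a minimal sketch (Python 3 syntax; square_with_pid and run_demo are names made up for this illustration, not part of the question's code). It maps 50 inputs over only 4 workers and tags each result with the PID of the process that computed it, so you can see the pool reusing a handful of workers instead of needing one per input.

```python
import os
from multiprocessing import Pool


def square_with_pid(x):
    """Return x squared together with the PID of the worker that ran the call."""
    return x * x, os.getpid()


def run_demo(n_items=50, n_workers=4):
    """Map n_items inputs over a pool of n_workers processes (hypothetical helper)."""
    with Pool(processes=n_workers) as pool:
        # map() blocks until every input has been processed
        tagged = pool.map(square_with_pid, range(n_items))
    squares = [sq for sq, _ in tagged]
    worker_pids = {pid for _, pid in tagged}
    return squares, worker_pids


if __name__ == "__main__":
    squares, worker_pids = run_demo()
    print(len(squares), "results computed by", len(worker_pids), "worker process(es)")
```

The number of distinct PIDs never exceeds the pool size, however many inputs you pass, and the results come back in input order.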

6 Comments

anonuser0428 Over a year ago

I thought it created 10 mappers, one for each value 0-9.

Tim Peters Over a year ago

Take "mappers" out of your mental model then ;-) The phrase doesn't really correspond to anything that's happening. The implementation of map() "simply" carves up the iterable and passes the elements to as many worker processes as there are. range(1000000) in your example would work fine too, although it would run faster if you used the optional chunksize argument.

anonuser0428 Over a year ago

OK, in my example, let's say range(10) was replaced by range(100). Would it be correct to say that the map function would create 100 "instances" of the function f, each with one of the values 0-99, and that the pool would pass 10 of those instances at a time to the worker threads, since there are only 10 worker threads? I'm just trying to make sure I completely understand what's going on here before implementing this in my application.

Tim Peters Over a year ago

Only one instance of f per process. One element is passed (to one of those instances of f), by the server, at a time, unless you use the chunksize argument. If you used chunksize=10, then 10 elements would be passed to a worker at a time by the server, and the worker would pass its chunk of 10 to f one at a time. That means 10x fewer interprocess communication calls, which is why it's faster. Without chunksize, the server passes 0 to some process, 1 to some process, 2 to some process ... and 99 to some process. The name "f" is also passed, but each process already has f's code.
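
As a rough sketch of the chunksize idea (Python 3 syntax; map_in_chunks is a hypothetical helper name, and the values 100, 10 and 10 simply mirror the numbers discussed in these comments), the call below hands each worker batches of 10 inputs rather than single elements:

```python
from multiprocessing import Pool


def f(x):
    return x * x


def map_in_chunks(n_items=100, n_workers=10, chunksize=10):
    """Send inputs to the workers in batches of `chunksize` elements."""
    with Pool(processes=n_workers) as pool:
        # Each task sent to a worker carries `chunksize` inputs;
        # the worker then calls f on them one at a time.
        return pool.map(f, range(n_items), chunksize=chunksize)


if __name__ == "__main__":
    results = map_in_chunks()
    print(results[:5], "...", results[-1])
```

The returned list is the same whatever chunksize you pick; only the amount of interprocess traffic changes. One caveat, based on my reading of current CPython rather than on anything in this thread: when chunksize is omitted, Pool.map appears to choose a batch size itself from the input length and the pool size, so pass chunksize=1 explicitly if you want strictly one element per dispatch.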

anonuser0428 Over a year ago

Perfect, thanks so much for the explanation. I'm much clearer about how the pool.map() function works now.

Eser Aygün · Accepted Answer · 2013-10-30 19:42:41Z

1

You can create more workers than the number of threads your CPU can execute. This is required in real-time applications, like a web server, where you must ensure that each client is able to communicate with you without having to wait for others. If it's not a real-time application and you just want to finish all the jobs as soon as possible, it would be wiser to create as many threads as your CPU can handle simultaneously.
Python takes care of assigning jobs to workers no matter how many jobs you have.
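
For the "as many workers as your CPU can handle" case, one possible sketch (Python 3 syntax; map_on_all_cores is an invented helper name) sizes the pool from multiprocessing.cpu_count() instead of hard-coding a number:

```python
import multiprocessing
from multiprocessing import Pool


def f(x):
    return x * x


def map_on_all_cores(values):
    """Run f over `values` with one worker per CPU reported by the OS."""
    n_workers = multiprocessing.cpu_count()
    with Pool(processes=n_workers) as pool:
        return n_workers, pool.map(f, values)


if __name__ == "__main__":
    n_workers, results = map_on_all_cores(range(20))
    print("used", n_workers, "workers:", results)
```

Calling Pool() with no processes argument should give a similar result, since the documented default is based on the machine's CPU count. For I/O-bound jobs a larger number may still be the better choice, as the other answer points out.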
