process flow in parallel processing in python

Question

I am learning about parallel processing in python and I have some very specific doubts regarding the execution flow of the following program. In this program, I am splitting my list into two parts depending on the process. My aim is to run the add function twice parallely where one process takes one part of the list and other takes other part.

import multiprocessing as mp
x = [1,2,3,4]

print('hello')
def add(flag, q_f):
    global x
    if flag == 1:
        dl = x[0:2]
    elif flag == 2:
        dl = x[2:4]
    else:
        dl = x
    x = [i+2 for i in dl]
    print('flag = %d'%flag)
    print('1')
    print('2')
    print(x)
    q_f.put(x)

print('Above main')

if __name__ == '__main__':
    ctx = mp.get_context('spawn')
    print('inside main')
    q = ctx.Queue()
    jobs = []
    for i in range(2):
        p = mp.Process(target = add, args = (i+1, q))
        jobs.append(p)
    for j in jobs:
        j.start()
    for j in jobs:
        j.join()
    print('completed')
    print(q.get())
    print(q.get())

print('outside main')

The output which I got is

hello
Above main
outside main
flag = 1
1
2
[3, 4]
hello
Above main
outside main
flag = 2
1
2
[5, 6]
hello
Above main
inside main
completed
[3, 4]
[5, 6]
outside main

My questions are

1) From the output, we can see that one process is getting executed first, then the other. Is the program actually utilizing multiple processors for parallel processing? If not, how can I make it parallely process? If it was parallely processing, the print statements print('1') print('2') should be executed at random, right?

2) Can I check programmatically on which processor is the program running?

3) Why are the print statements outside main(hello, above main, outside main) getting executed thrice?

4) What is the flow of the program execution?

1) It could be that your code executes too quickly (first process finishes before second can get started) 2) Not sure. 3) I don't see the extra executions of print statements when I run your code. — John Anderson
– John Anderson, Commented Dec 21, 2017 at 2:11

Michael Butscher · Accepted Answer · 2017-12-21 02:12:03Z

1) The execution of add() is probably done so fast that the first execution ended already when the second process is started.

2) A process is usually not assigned to a particular CPU but jumps between them

3) If you are using Windows for each started process the module must be executed again. For these executions __name__ isn't 'main' but all unconditional commands (outside of if and such) like these prints are executed.

4) When start() of a Process is called on Windows a new Python interpreter is started which means that necessary modules are imported (and therefore executed) and necessary resources to run the subprocess are handed to the new interpreter (the "spawn"-method described at https://docs.python.org/3.6/library/multiprocessing.html#contexts-and-start-methods). All processes then run independently (if no synchronization is done by the program)

Collectives™ on Stack Overflow

process flow in parallel processing in python

1 Answer 1

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Your Answer

Sign up or log in

Post as a guest

Related