
I have an i5-8600K with 6 cores and am running Windows 10. I am trying to perform multiprocessing with 2 NumPy functions. I asked about this issue before but have not been able to get it running: issue; the code below is from the answer to that question. I am trying to run func1() and func2() at the same time, however, when I run the code below it keeps running forever.

import multiprocessing as mp
import numpy as np
num_cores = mp.cpu_count()

Numbers = np.array([1,2,3,4,5,6,7,8,9,10,11,12])
def func1():
    Solution_1 = Numbers + 10
    return Solution_1

def func2():
    Solution_2 = Numbers * 10
    return Solution_2

# Set up the pool, leaving one core aside
pool = mp.Pool(num_cores-1)
# This is to use all functions easily
functions = [func1, func2]
# This is to store the results
solutions = []
for function in functions:
    solutions.append(pool.apply(function, ()))

(Screenshot: the Jupyter cell keeps showing an asterisk, i.e. it never finishes.)

5 Comments
  • On Linux Mint with a very old processor it runs in less than 0.03 seconds. But I ran it normally with python script.py, not in Jupyter Notebook. Commented Feb 11, 2021 at 2:58
  • Is there a reason why it might not run in Jupyter Notebook? It uses the Python kernel. Commented Feb 11, 2021 at 6:21
  • Yes, multiprocessing requires importing the __main__ module, which is not possible in an interactive session: stackoverflow.com/a/23641560/3220135 Commented Feb 11, 2021 at 6:27
  • Interactive mode is great for prototyping and exploratory analysis, but not for actually running code you've built. Commented Feb 11, 2021 at 6:29
  • I have now tested it in Jupyter Notebook and it works in 0.05 seconds. BTW: in both versions I had to add print(solutions) to see the results. Commented Feb 11, 2021 at 11:04

1 Answer


There are several issues with the code. First, if you want to run this under Jupyter Notebook on Windows, you need to put your worker functions func1 and func2 in an external module, for example workers.py, and import them. That means you now need to either pass the Numbers array as an argument to the workers or initialize each process's static storage with the array when you initialize the pool. We will use the second method, with a function called init_pool, which also has to be imported if we are running under Notebook:

workers.py

def func1():
    # Numbers is set in each worker process by init_pool below
    Solution_1 = Numbers + 10
    return Solution_1

def func2():
    Solution_2 = Numbers * 10
    return Solution_2

def init_pool(n_array):
    # Pool initializer: runs once in each worker process and stores the array as a global
    global Numbers
    Numbers = n_array

The second issue is that under Windows, the code that creates subprocesses or a multiprocessing pool must be inside a block guarded by if __name__ == '__main__':. Third, it is wasteful to create a pool larger than 2 when you are only trying to run two parallel "jobs." Fourth, and finally, you are using the wrong pool method: apply blocks until the submitted "job" (i.e. the one processed by func1) completes, so you achieve no parallelism at all. You should be using apply_async.

import multiprocessing as mp
import numpy as np
from workers import func1, func2, init_pool


if __name__ == '__main__':
    #num_cores = mp.cpu_count()
    Numbers = np.array([1,2,3,4,5,6,7,8,9,10,11,12])
    pool = mp.Pool(2, initializer=init_pool, initargs=(Numbers,)) # more than 2 is wasteful
    # This is to use all functions easily
    functions = [func1, func2]
    # This is to store the results
    solutions = []
    results = [pool.apply_async(function) for function in functions]
    for result in results:
        solutions.append(result.get()) # wait for completion and get the result
    print(solutions)

Prints:

[array([11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22]), array([ 10,  20,  30,  40,  50,  60,  70,  80,  90, 100, 110, 120])]

(Screenshot: the above code running successfully under Jupyter Notebook on Windows.)
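As noted above, the other option is to pass the Numbers array as an argument to the workers instead of initializing each process with init_pool. A minimal sketch of that variant, assuming a hypothetical module workers_args.py (the parameterized function signatures are not part of the original answer):

workers_args.py

def func1(numbers):
    # The array arrives as an argument instead of a process-wide global
    return numbers + 10

def func2(numbers):
    return numbers * 10

and the notebook code becomes:

import multiprocessing as mp
import numpy as np
from workers_args import func1, func2

if __name__ == '__main__':
    Numbers = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
    with mp.Pool(2) as pool:
        # apply_async pickles Numbers and ships it to the worker with each job
        results = [pool.apply_async(function, (Numbers,)) for function in (func1, func2)]
        solutions = [result.get() for result in results]
    print(solutions)

This prints the same result; the trade-off is that the array is pickled and sent with every submitted job rather than once per process by the initializer.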


6 Comments

Thank you, it does work. However, I could not make it work with Jupyter. I am running Python 3.7.8 on the Jupyter kernel; is there a way I could make it work with the program?
I am not sure what your issue is. I am running Python 3.8.5 under Windows 10 and this worked fine under Jupyter Notebook and Jupyter Lab. I have attached to my answer an image of this. You need to describe in greater detail what "I could not make it work" means. Do you see errors on the Jupyter Notebook console, for example?
It shows that that piece of code keeps on running and I am not sure why. I have updated the question with a snapshot showing that the asterisk is still there.
My answer clearly states that the functions func1, func2 and init_pool must be placed in a module and imported. Re-read my answer's description and re-read my code. I placed these in a file workers.py in the same directory as the .ipynb file that contains my cells. If you look at the "console" where you started up Notebook, you will see lots of error messages if you do not do this.
I didn't think that was necessary, so I put all the functions in one Python file. It works as script.py but it doesn't work with Jupyter Notebook, which I don't really understand.
