
I want to run, say, 10 tasks: the same function with 10 different parameters, each call on a different CPU. How do I do that using OpenMP in Python code? For some reason the mpi4py and multiprocessing packages are blocked on our local cluster, so I am wondering whether I can parallelize the code using OpenMP alone.

What I tried:


import numpy as np
import time

t1_start = time.process_time()

def func(a):
    print("helloworld")

max = 10  # number of tasks (note: this shadows the builtin max)
for b in range(max):
    obs = func(b)

print(time.process_time() - t1_start)

I want obs = func(b) to run on different processors, assigned automatically, for a range of values. With the mpi4py package I can use MPI.scatter() to do this automatically, but I don't know whether the same is possible with OpenMP alone.

Comments:
  • Sounds like it should be possible: put the for b in range(max) loop in a C function which you then make OpenMP-parallel. Commented Dec 22, 2022 at 20:43
  • @VictorEijkhout While that works, it will unfortunately be slower because of the GIL (which protects the interpreter). This is a tedious specificity of Python (mainly CPython); other interpreters, such as those of Ruby and R, have the same issue. Commented Dec 22, 2022 at 21:09

1 Answer


This is not really possible in pure Python using the standard interpreter (CPython). Strictly speaking it is possible, but it is generally useless for computationally intensive code: if you use multiple threads, the Global Interpreter Lock (GIL) of CPython prevents any parallel speed-up of computationally intensive code (it mainly helps I/O-bound work), since the computation is serialized in the end. In fact, the computation can even be slower because of lock contention. Python does not (officially) support OpenMP; it has its own threading layer. Note that some libraries like NumPy can release the GIL for some functions, but the speed-up is often disappointing (because the GIL is not completely released).
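To illustrate the point about the GIL: a pure-Python thread pool (a sketch; func here is a CPU-bound placeholder, not the asker's actual function) will happily run the 10 tasks concurrently and produce correct results, but the GIL serializes the actual computation, so there is no parallel speed-up:

```python
from concurrent.futures import ThreadPoolExecutor

def func(b):
    # CPU-bound placeholder: the GIL lets only one thread execute
    # this Python bytecode at a time, so threads give no speed-up here.
    return sum(i * i for i in range(100_000)) + b

# 10 threads, one per task; they run concurrently but not in parallel.
with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(func, range(10)))

print(results)
```

The results are correct, only the wall-clock time is roughly the same as (or worse than) the sequential loop.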

If you want to use OpenMP in Python, you can use JIT/AOT compilers like Numba or Cython. Numba only supports parallel loops (using prange with the flag parallel=True) and basic reductions, so the support is minimalist (no atomic variables for CPU code, no locks, no tasks, no way to tweak the loop scheduling, no SIMD directives, etc.). AFAIK, the same is true for Cython. If you need advanced features, then you need to use a language that officially supports OpenMP, like C, C++ or Fortran. You can also write Numba/Cython functions that release the GIL and call them from C/C++/Fortran code using OpenMP, but the OpenMP features will not be available from the Python function.
