
I have a Python code which runs another application using subprocess.Popen and mpirun. While the code runs perfectly fine on one machine, on the other I run into problems. On that machine, however, I also have an older conda environment where it works. The minimal code to reproduce the issue is the following:

from subprocess import Popen
from mpi4py import MPI

proc = Popen("mpirun -n 2 echo 1".split())

At this line, proc immediately terminates and proc.poll() returns 1. The Python script doesn't actually use MPI itself; it is simply run as python script.py. It does, however, depend on another program that uses MPI: I need to repeatedly run that code with mpirun (of course I don't actually execute echo 1).

I assume it depends on the installed MPI:

Working:

$ conda list -n ForkTPS | grep mpi
WARNING: The conda.compat module is deprecated and will be removed in a future release.
fftw                      3.3.8           mpi_mpich_hc19caf5_1012    conda-forge
h5py                      2.10.0          nompi_py38h7442b35_105    conda-forge
hdf5                      1.10.6          mpi_mpich_hc096b2c_1010    conda-forge
mpi                       1.0                       mpich    conda-forge
mpi4py                    3.0.3            py38h4a80816_2    conda-forge
mpich                     3.3.2                hc856adb_2    conda-forge

as well as

conda list | grep mpi
dask-mpi                  2.21.0                   pypi_0    pypi
fftw                      3.3.8           mpi_mpich_h3f9e1be_1011    conda-forge
hdf5                      1.10.5          mpi_mpich_ha7d0aea_1004    conda-forge
impi_rt                   2019.8                intel_254    intel
libnetcdf                 4.7.4           mpi_mpich_h755db7c_1    conda-forge
mpi                       1.0                       mpich  
mpi4py                    3.0.3            py37hf484d3e_7    intel
mpich                     3.3.2                hc856adb_0    conda-forge
netcdf4                   1.5.3           mpi_mpich_py37h91af3bc_3    conda-forge

Not working:

conda list | grep mpi
fftw                      3.3.8           mpi_openmpi_h6dd7431_1011    conda-forge
hdf5                      1.10.6          mpi_openmpi_hac320be_1    conda-forge
mpi                       1.0                     openmpi    conda-forge
mpi4py                    3.0.3            py38h246a051_2    conda-forge
openmpi                   4.0.5                hdf1f1ad_1    conda-forge

Is there a reasonable and reproducible way to avoid this issue? I have to make my code available to several collaborators. At first glance, I would say the difference is using MPICH vs. Open MPI.

  • What if you simply run mpirun -np 1 echo 1 from the command line? And then mpirun -np 2 echo 1? Commented Oct 29, 2020 at 0:00
  • If I just execute the commands by hand, I get the correct results: '1' and '1\n1'. Commented Oct 29, 2020 at 13:45

2 Answers


At least with Open MPI, you cannot fork&exec mpirun from an MPI program.

Because you from mpi4py import MPI, the Python script runs as a singleton MPI program, and hence you cannot Popen(["mpirun", ...]).

Getting rid of the mpi4py line should fix your issue.


2 Comments

Yes, obviously removing the from mpi4py import MPI solves the issue, but it comes with the very heavy cost of having to maintain a private fork of a very complicated project. Your answer indicates that it is indeed relevant to avoid Open MPI and instead use e.g. MPICH.
"The Python script doesn't actually use MPI" was an incorrect statement; some might have found that obvious and hence not felt the need to state it. Anyway, do not incorrectly read between the lines: Open MPI alone won't work for you here, but do not conclude MPICH will always work. You might consider running your Python script with mpi4py built on top of Open MPI and invoking mpirun from MPICH, or the other way around. The canonical way of dealing with this is to use MPI_Comm_spawn() instead of fork&exec'ing mpirun (see the sketch below).

I want to add an answer based on: https://stackoverflow.com/a/60070753/12286484

Passing env=os.environ to Popen should be a workaround:

from subprocess import Popen
from mpi4py import MPI
import os
proc = Popen("mpirun -n 2 echo 1".split(), env=os.environ)

More on that workaround is also explained in the post above.

3 Comments

This is really a workaround that might not last forever.
Good point. Still, I think passing the environment env to Popen is a valid workaround. As I see it, for now the environment variables are not updated in os.environ when importing packages. As you rightly point out, this might very well change in the future. Do you think a possible future-proof workaround is to create a variable basic_env = os.environ.copy() right at the beginning of the code, before importing any package other than os, and then pass this to Popen(..., env=basic_env)?
I think the right way is to prevent mpi4py from automatically invoking MPI_Init() under the hood when this is not desired (e.g. in the wrapper script). That can be achieved via mpi4py.rc.initialize = False, as documented at mpi4py.readthedocs.io/en/stable/mpi4py.html (see the sketch below).
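For reference, a minimal sketch of that suggestion, assuming the wrapper script itself never needs to make MPI calls. The rc options must be set before mpi4py.MPI is imported.

import mpi4py
mpi4py.rc.initialize = False   # do not call MPI_Init() when mpi4py.MPI is imported
mpi4py.rc.finalize = False     # do not call MPI_Finalize() at interpreter exit

from mpi4py import MPI         # safe now: no MPI_Init() happens here
from subprocess import Popen

# mpirun can be fork&exec'ed as an ordinary child process.
proc = Popen("mpirun -n 2 echo 1".split())
proc.wait()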
