Calling a subprocess within a script using mpi4py

Question

I’m having trouble calling an external program from my Python script, in which I want to use mpi4py to distribute the workload among different processors.

Basically, I want to use my script such that each core prepares some input files for calculations in separate folders, then starts an external program in this folder, waits for the output, and then, finally, reads the results and collects them.

However, I simply cannot get the external program call to work. On my search for a solution to this problem I've found that the problems I'm facing seem to be quite fundamental. The following simple example makes this clear:

#!/usr/bin/env python
import subprocess

subprocess.call("EXTERNAL_PROGRAM", shell=True)
subprocess.call("echo test", shell=True)

Running ./script.py directly works fine (both calls produce output), while mpirun -np 1 ./script.py only outputs test. Is there any workaround for this situation? The program is definitely in my PATH, but it also fails if I use the absolute path for the call.

This SO question seems to be related, but sadly it has no answers.

EDIT:

In the original version of my question I’ve not included any code using mpi4py, even though I mention this module in the title. So here is a more elaborate example of the code:

#!/usr/bin/env python

import os
import subprocess

from mpi4py import MPI


def worker(parameter=None):
    """Make new folder, cd into it, prepare the config files and execute the
    external program."""

    cwd = os.getcwd()
    # Named job_dir to avoid shadowing the built-in dir()
    job_dir = os.path.join(cwd, "_calculation_" + parameter)
    os.makedirs(job_dir)
    os.chdir(job_dir)

    # Write input for simulation & execute
    subprocess.call("echo {} > input.cfg".format(parameter), shell=True)
    subprocess.call("EXTERNAL_PROGRAM", shell=True)

    # After the program is finished, do something here with the output files
    # and return the data. I'm using the input parameter as a dummy variable
    # for the processed output.
    data = parameter

    os.chdir(cwd)

    return data


def run_parallel():
    """Iterate over job_args in parallel."""

    comm = MPI.COMM_WORLD
    size = comm.Get_size()
    rank = comm.Get_rank()

    if rank == 0:
        # Here should normally be a list with many more entries, subdivided
        # among all the available cores. I'll keep it simple here, so one has
        # to run this script with mpirun -np 2 ./script.py
        job_args = ["a", "b"]
    else:
        job_args = None

    job_arg = comm.scatter(job_args, root=0)
    res = worker(parameter=job_arg)
    results = comm.gather(res, root=0)

    print(res)
    print(results)

if __name__ == '__main__':
    run_parallel()

Unfortunately, I cannot provide more details about the external executable EXTERNAL_PROGRAM other than that it is a C++ application which is MPI enabled. As written in the comment section below, I suspect that this is the reason (or one of the reasons) why my external program call is basically ignored.

Please note that I’m aware of the fact that in this situation, nobody can reproduce my exact situation. Still, however, I was hoping that someone here already ran into similar problems and might be able to help.

For completeness, the OS is Ubuntu 14.04 and I’m using OpenMPI 1.6.5.

why do you use mpirun to run non-mpi-enabled python script?
– jfs, Commented Jan 5, 2015 at 8:25
In my search for a minimal working example I might have overdone it. However, the given example still illustrates the main point of my problem: I’m unable to call a specific external program from a python script which is run in an MPI environment. What I’ve not yet mentioned is that this external program is itself MPI-enabled, so this might explain the different behaviour I’m experiencing with the above subprocess calls.
– nilfisque, Commented Jan 5, 2015 at 20:47
the code example is not complete, see how to create a minimal complete code example. How can I (or anybody else) reproduce your issue? (provide specific steps) I don't see where mpi4py is used. What is your environment (OS, what mpi implementation, versions, etc)?
– jfs, Commented Jan 6, 2015 at 6:00
Since this is probably too lengthy for a comment, I’ve edited my original post.
– nilfisque, Commented Jan 6, 2015 at 17:18
What is the specific issue? How does the code fail? Describe using words: what do you expect to happen and what happens instead? What happens if you replace EXTERNAL_PROGRAM with echo abc or a hello world mpi program? Your example is not minimal, remove code that is not required to reproduce the issue.
– jfs, Commented Jan 7, 2015 at 4:20

Al Conrad · Accepted Answer · 2015-12-04 21:45:16Z

In your first example you might be able to do this:

#!/usr/bin/env python
import subprocess

subprocess.call("EXTERNAL_PROGRAM && echo test", shell=True)

The python script is only facilitating the MPI call. You could just as well write a bash script with command “EXTERNAL_PROGRAM && echo test” and mpirun the bash script; it would be equivalent to mpirunning the python script.
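Because && runs echo test only when EXTERNAL_PROGRAM exits successfully, the chained command still does not show why the program stays silent. A minimal diagnostic sketch, using only the standard library, is to capture the exit status and stderr of the call. The helper name run_and_report is hypothetical, and a Python one-liner stands in for EXTERNAL_PROGRAM, whose behaviour is not known here.

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run cmd (a list of arguments) and return (returncode, stdout, stderr)."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode the captured output to text
    )
    out, err = proc.communicate()  # waits until the process has finished
    return proc.returncode, out, err


if __name__ == "__main__":
    # Stand-in for ["EXTERNAL_PROGRAM"]; substitute the real command here.
    code, out, err = run_and_report([sys.executable, "-c", "print('test')"])
    print("exit code: {}".format(code))
    print("stdout: {!r}".format(out))
    print("stderr: {!r}".format(err))
```

If the same script is started with mpirun -np 1, a nonzero exit code or a message on stderr from the real program would show how the call fails instead of leaving it silent.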

The second example will not work if EXTERNAL_PROGRAM is MPI enabled. Importing MPI from mpi4py initializes the MPI environment, and once it has been initialized in this manner you cannot launch another MPI program from that process as an ordinary subprocess. You could instead spawn it using MPI_Comm_spawn or MPI_Comm_spawn_multiple and the -up option to mpirun. For mpi4py, refer to the Compute PI example for spawning (use MPI.COMM_SELF.Spawn).
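The spawning route might look roughly like the following. This is a sketch under assumptions, not a tested recipe for the program in the question: the helper names calculation_dir and spawn_in_dir are hypothetical, it assumes the MPI implementation honours the wdir info key that the MPI standard reserves for MPI_Comm_spawn, and it assumes EXTERNAL_PROGRAM tolerates being started as a spawned child. mpi4py is imported inside the function so that the path helper stays usable on a machine without MPI.

```python
import os


def calculation_dir(base, parameter):
    """Return the per-job folder name used by worker() in the question."""
    return os.path.join(base, "_calculation_" + parameter)


def spawn_in_dir(command, workdir, args=(), maxprocs=1):
    """Spawn an MPI-enabled program in workdir via MPI.COMM_SELF.Spawn.

    Returns the intercommunicator that connects this process to the children.
    """
    from mpi4py import MPI  # importing this module initializes MPI

    info = MPI.Info.Create()
    info.Set("wdir", workdir)  # assumption: the implementation honours "wdir"
    try:
        return MPI.COMM_SELF.Spawn(
            command, args=list(args), maxprocs=maxprocs, info=info
        )
    finally:
        info.Free()


# Possible use inside worker(), replacing the subprocess call:
#   job_dir = calculation_dir(os.getcwd(), parameter)
#   os.makedirs(job_dir)
#   child = spawn_in_dir("EXTERNAL_PROGRAM", job_dir)
```

How the parent learns that the spawned program has finished depends on the program. In the Compute PI example, parent and child exchange data over the intercommunicator and then both call Disconnect, which an unmodified external program would not do.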
