I have been trying to write a distributed application using PyTorch, following the tutorial here. I am using the "MPI Backend" option. According to the tutorial, I need to follow the basic steps to install PyTorch and then install Open MPI with conda install -c conda-forge openmpi.

Unfortunately, whenever I try to run a script using mpirun mpiexec -n 2 python ptdist.py, I get the following error: RuntimeError: Distributed package doesn't have MPI built in. I believe this happens because the import of ProcessGroupMPI fails in the Python code here.

I have tried installing Open MPI from source as well as running sudo apt-get install python-mpi4py, but I am still facing the same error.

I also tried pip install mpi4py, but that does not help either.

Does anyone know what the problem is?

1 Answer

From https://medium.com/@esaliya/pytorch-distributed-with-mpi-acb84b3ae5fd

The MPI backend, though supported, is not available unless you compile PyTorch from its source

This suggests you should first install your favorite MPI library (and possibly mpi4py built on top of it), and only then build PyTorch from source.
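
To check whether the PyTorch build you are running actually includes the MPI backend, something like the following minimal sketch may help. It relies on torch.distributed.is_available() and torch.distributed.is_mpi_available(); the function name mpi_backend_status and the status strings it returns are made up for this illustration and are not part of PyTorch.

```python
import importlib


def mpi_backend_status():
    """Report whether the installed PyTorch build includes the MPI backend.

    The returned status strings are hypothetical labels for this sketch.
    """
    try:
        # Imported lazily so the check also works when torch is missing.
        dist = importlib.import_module("torch.distributed")
    except ImportError:
        return "torch-not-installed"
    if not dist.is_available():
        # This build was compiled without the distributed package at all.
        return "distributed-not-built"
    # is_mpi_available() reports whether this build has the MPI backend.
    return "mpi-available" if dist.is_mpi_available() else "mpi-not-built"


if __name__ == "__main__":
    print(mpi_backend_status())
```

If this prints mpi-not-built, the installed binary was not compiled against MPI, and installing Open MPI or mpi4py afterwards should not change that; rebuilding PyTorch from source with an MPI library already installed is the step the quote above points to.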

2 Comments

@Gilies I followed the same Medium tutorial to set everything up, and I am still getting the same error.
There is another tutorial at pytorch.org/tutorials/intermediate/dist_tuto.html. I suggest you restart from a fresh install (so that you do not run python setup.py install too early). Also check the build output to confirm that the MPI backend is found.