
I'm trying to send a std::vector using MPI. This works fine when the vector is small, but fails when the vector is large (more than roughly 15k doubles). When trying to send a vector with 20k doubles, the program just sits there with the CPU at 100%.

Here is a minimal example:

#include <vector>
#include <mpi.h>

using namespace std;

vector<double> send_and_receive(vector<double> &local_data, int n, int numprocs, int my_rank) {
    // Every rank, including rank 0 itself, sends its chunk to rank 0.
    MPI_Send(&local_data[0], n, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);

    if (my_rank == 0) {
        // Rank 0 gathers all the chunks into one big vector.
        vector<double> global_data(numprocs*n);
        vector<double> temp(n);
        for (int rank = 0; rank < numprocs; rank++) {
            MPI_Recv(&temp[0], n, MPI_DOUBLE, rank, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            for (int i = 0; i < n; i++) {
                global_data[rank*n + i] = temp[i];
            }
        }
        return global_data;
    }
    return vector<double>();
}

int main(int args, char *argv[]) {
    int my_rank, numprocs;
    // MPI initialization
    MPI_Init (&args, &argv);
    MPI_Comm_rank (MPI_COMM_WORLD, &my_rank);
    MPI_Comm_size (MPI_COMM_WORLD, &numprocs);

    int n = 15000;
    vector<double> local_data(n);

    for (int i = 0; i < n; i++) {
        local_data[i] = n*my_rank + i;
    }

    vector<double> global_data = send_and_receive(local_data, n, numprocs, my_rank);

    MPI_Finalize();

    return 0;
}

I compile using

mpic++ main.cpp

and run using

mpirun -n 2 a.out

When I run with n = 15000 the program completes successfully, but with n = 17000 or n = 20000 it never finishes, and both CPUs sit at 100% until I force-close the program.

Does anyone know what the problem could be?

1 Answer


MPI_Send is a funny call. If there is enough internal buffer space to store the input, it may return immediately; the only guarantee it makes is that the input buffer will not be needed by MPI afterwards. However, if there isn't enough internal buffer space, the call blocks until the matching MPI_Recv begins to receive the data. See where this is going? Both processes post an MPI_Send that blocks due to insufficient buffer space. When debugging issues like this, it helps to replace MPI_Send with MPI_Ssend, which always blocks until the matching receive starts, so the deadlock shows up at any message size.
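For illustration, that debugging swap is a one-line change in the question's send_and_receive (a sketch of the debugging aid, not a fix; MPI_Ssend takes the same arguments as MPI_Send):

// MPI_Ssend has the same signature as MPI_Send but always blocks until the
// matching receive has started, so the hang reproduces regardless of n.
MPI_Ssend(&local_data[0], n, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);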

Your possible solutions are:

  • Use a buffered send, MPI_Bsend.
  • Use MPI_Sendrecv.
  • Alternate the send/recv pairs so that each send has a matching recv (e.g. odd ranks send while even ranks recv, then vice versa).
  • Use a non-blocking send, MPI_Isend (see the sketch below).

See http://www.netlib.org/utk/papers/mpi-book/node39.html
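To make the MPI_Isend option concrete, here is a minimal sketch of how send_and_receive from the question could be rewritten. This is one illustration of the approach, not the only valid placement of MPI_Wait:

#include <vector>
#include <mpi.h>

using namespace std;

vector<double> send_and_receive(vector<double> &local_data, int n, int numprocs, int my_rank) {
    MPI_Request request;
    // The non-blocking send returns immediately, so rank 0 can go on to post
    // its receives instead of blocking inside its own send.
    MPI_Isend(&local_data[0], n, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &request);

    vector<double> global_data;
    if (my_rank == 0) {
        global_data.resize(numprocs*n);
        for (int rank = 0; rank < numprocs; rank++) {
            // Receive each chunk directly into its slot in global_data,
            // which also removes the temp-vector copy loop.
            MPI_Recv(&global_data[rank*n], n, MPI_DOUBLE, rank, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        }
    }
    // Make sure the send has completed before local_data may be reused or freed.
    MPI_Wait(&request, MPI_STATUS_IGNORE);
    return global_data;
}

As the comments below note, no Irecv and no barrier are needed here: the blocking receives on rank 0 are the matching receives, and MPI_Wait guarantees the local buffer is safe to reuse.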


2 Comments

Thanks a lot, I was just looking into Isend/Irecv. If I just use Isend and Irecv instead in the example, should I add MPI_Barrier(MPI_COMM_WORLD) before receiving?
No need for Irecv; you only need the send to return. No need for a barrier either.
