
I am learning to use Boost.MPI to parallelize a large amount of computation; below is a simple test to check that I have the MPI logic right. However, I cannot get it to work. I used world.size() = 10, there are 50 elements in total in the data array, and each process does 5 iterations. I would like to update the data array by having each process send its updated data array to the root process, and then have the root process receive the updated arrays and print them out. But only a few elements end up updated.

Thanks for helping me.

#include <boost/mpi.hpp>
#include <iostream>
#include <cstdlib>

namespace mpi = boost::mpi;
using namespace std;

#define max_rows 100
int data[max_rows];

int modifyArr(const int index, const int arr[]) {
  return arr[index]*2+1;
}

int main(int argc, char* argv[])
{
  mpi::environment env(argc, argv);
  mpi::communicator world;

  int num_rows = 50;
  int my_number;

  if (world.rank() == 0) {
    for ( int i = 0; i < num_rows; i++)
        data[i] = i + 1;
  }

  broadcast(world, data, 0);

  for (int i = world.rank(); i < num_rows; i += world.size()) {
    my_number = modifyArr(i, data);
    data[i]   = my_number;

    world.send(0, 1, data);

    //cout << "i=" << i << " my_number=" << my_number << endl;

    if (world.rank() == 0)
      for (int j = 1; j < world.size(); j++) 
        mpi::status s = world.recv(boost::mpi::any_source, 1, data);
  }

  if (world.rank() == 0) {
    for ( int i = 0; i < num_rows; i++)
      cout << "i=" << i << " results = " << data[i] << endl;
  }

  return 0;
}

1 Answer


Your problem is probably here:

mpi::status s = world.recv(boost::mpi::any_source, 1, data);

This is the only way data can get back to the master node.

However, you do not tell the master node where in data to store the answers it receives. Since data is just the address of the array, each incoming message is written starting at element zero rather than at the slot it belongs to.
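For illustration, here is a minimal sketch of one way to fix that while keeping the interleaved layout (not the poster's exact code): send each update as an {index, value} pair so the root knows where to store it. It assumes the same world, data, num_rows, and modifyArr as in the question and would replace the middle of main():

// Sketch: every rank (including 0) updates its interleaved elements;
// non-root ranks ship each update as {index, value} so the root can
// store it at the correct offset instead of overwriting data[0].
for (int i = world.rank(); i < num_rows; i += world.size()) {
  data[i] = modifyArr(i, data);
  if (world.rank() != 0) {
    int msg[2] = { i, data[i] };
    world.send(0, 1, msg, 2);             // pointer + count overload
  }
}

if (world.rank() == 0) {
  // One message arrives for every element owned by a non-root rank.
  for (int i = 0; i < num_rows; i++) {
    if (i % world.size() == 0) continue;  // rank 0's own elements
    int msg[2];
    world.recv(mpi::any_source, 1, msg, 2);
    data[msg[0]] = msg[1];                // write to the right index
  }
}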

Interleaving which elements of the array you are processing on each node is a pretty bad idea. You should assign blocks of the array to each node so that you can send entire chunks of the array at once. That will reduce communication overhead significantly.
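To make that concrete, here is a minimal sketch of the block approach (an illustration, not the poster's program), assuming num_rows divides evenly by world.size(), as it does in the question's 50-element / 10-process case:

#include <boost/mpi.hpp>
#include <iostream>
#include <vector>

namespace mpi = boost::mpi;

int main(int argc, char* argv[]) {
  mpi::environment env(argc, argv);
  mpi::communicator world;

  const int num_rows = 50;
  const int chunk = num_rows / world.size();  // assumes an even split
  std::vector<int> data(num_rows);

  if (world.rank() == 0)
    for (int i = 0; i < num_rows; i++)
      data[i] = i + 1;

  // Everyone needs the initial values (same idea as the question's broadcast).
  mpi::broadcast(world, data.data(), num_rows, 0);

  // Each rank transforms only its own contiguous block.
  const int begin = world.rank() * chunk;
  std::vector<int> my_block(chunk);
  for (int k = 0; k < chunk; k++)
    my_block[k] = data[begin + k] * 2 + 1;

  // One collective call gathers the blocks, in rank order, onto rank 0.
  mpi::gather(world, my_block.data(), chunk, data.data(), 0);

  if (world.rank() == 0)
    for (int i = 0; i < num_rows; i++)
      std::cout << "i=" << i << " results = " << data[i] << std::endl;

  return 0;
}

A single gather replaces all of the per-element sends and receives, which is where the communication savings come from.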

Also, if your issue is simply speeding up for loops, you should consider OpenMP, which can do things like this:

#pragma omp parallel for
for (int i = 0; i < 100; i++)
  data[i] *= 4;

Bam! I just split that for loop up between all of my threads with no further work needed.
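For reference, a self-contained version of that snippet might look like the following (-fopenmp is the usual GCC/Clang flag; other compilers spell it differently):

#include <cstdio>
#include <omp.h>

int main() {
  int data[100];
  for (int i = 0; i < 100; i++)
    data[i] = i;

  // The iterations below are divided among the available threads.
  #pragma omp parallel for
  for (int i = 0; i < 100; i++)
    data[i] *= 4;

  std::printf("data[10] = %d, max threads = %d\n",
              data[10], omp_get_max_threads());
  return 0;
}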


1 Comment

@Richard Thank you. In my case OpenMP only sped up my for loop (which is more complicated than this snippet) a little, so I am pursuing Open MPI or the Parallel Boost Graph Library. I will take your advice about assigning blocks of the array to each node instead of interleaving, so entire chunks can be sent at once, and rewrite the code.
