
I am struggling to write code that performs like the following example, similar to the Up Phase of a prefix scan, without using the MPI_Scan function:

WholeArray[16] = [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]

Processor 0 got [0 , 1 , 2 , 3] , Processor 1 got [4 , 5 , 6 , 7] 

Processor 2 got [8 , 9 , 10 , 11] , Processor 3 got [12 , 13 , 14 , 15] 

The last element of each local array is then sent and summed over 2 strides:

(stride 1)

Processor 0 sends Array[3]; Processor 1 receives it from Processor 0 and adds it to its own Array[3]

Processor 2 sends Array[3]; Processor 3 receives it from Processor 2 and adds it to its own Array[3]

(stride 2)

Processor 1 sends Array[3]; Processor 3 receives it from Processor 1 and adds it to its own Array[3]

Finally, I want to use MPI_Gather so that the result is:

WholeArray = [0 , 1 , 2 , 3 , 4 , 5 , 6 ,10 , 8 , 9 , 10 , 11 , 12 , 13 ,14 , 36]
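(After stride 1, Array[3] on Processor 1 becomes 3 + 7 = 10 and Array[3] on Processor 3 becomes 11 + 15 = 26; after stride 2, Processor 3 adds the updated 10 from Processor 1, giving 26 + 10 = 36.)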

I find it hard to write code that makes the program behave like the following 4-node example:

(1st stride) - Processor 0 sends to Processor 1 and Processor 1 receives from Processor 0
(1st stride) - Processor 2 sends to Processor 3 and Processor 3 receives from Processor 2

(2nd stride) - Processor 1 sends to Processor 3 and Processor 3 receives from Processor 1

Here is the code that I have written so far:

int Send_Receive(int* my_input, int size_per_process, int rank, int size)
{
    int key = 1;
    int temp = my_input[size_per_process-1];

    while(key <= size/2)
    {
        if((rank+1) % key == 0)
        {
            if(rank/key % 2 == 0)
            {
                MPI_Send(&temp, 1, MPI_INT, rank+key,0,MPI_COMM_WORLD);
            }
            else
            {
                MPI_Recv(&temp, 1, MPI_INT, rank-key,0,MPI_COMM_WORLD,MPI_STATUS_IGNORE);
                my_input[size_per_process]+= temp;
            }
            key = 2 * key;
            MPI_Barrier(MPI_COMM_WORLD);
        }
    }

    return (*my_input);
}

1 Answer


There are a few issues in your code. Namely, 1) it always sends the same temp variable across processes:

MPI_Send(&temp, 1, MPI_INT, rank+key,0,MPI_COMM_WORLD);

The temp variable is initialized before the loop:

 int temp = my_input[size_per_process-1];
 while(key <= size/2)
 { ...}

but it is never updated inside the loop. This leads to wrong results, since after the first stride the last element of the my_input array will be different for some processes. Instead, you should do:

temp = localdata[size_per_process-1];
MPI_Send(&temp, 1, MPI_INT, rank+key, 0, MPI_COMM_WORLD);
                            

Moreover, 2) the following statement

my_input[size_per_process]+= temp;

does not add temp to the last position of the array my_input. Instead, it should be:

my_input[size_per_process-1]+= temp;

Finally, 3) there are deadlock and infinite-loop issues. For starters, having a call to a collective communication routine such as MPI_Barrier inside a conditional that not every process takes is typically a big red flag. Instead of:

while(key <= size/2)
{
   if((rank+1) % key == 0){
       ...
       MPI_Barrier(MPI_COMM_WORLD);
   }
}

you should have:

while(key <= size/2)
{
   if((rank+1) % key == 0){
       ...
   }
   MPI_Barrier(MPI_COMM_WORLD);
}

to ensure that every process calls the MPI_Barrier.

The infinite loop happens because the while condition depends on key being updated, but key is only updated when (rank+1) % key == 0 evaluates to true. Therefore, whenever that condition evaluates to false, the process never updates key and, consequently, gets stuck in an infinite loop.
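Putting the three fixes together, the Send_Receive function from the question would look roughly like the following sketch (original signature kept; the key update and barrier are now executed by every process):

int Send_Receive(int* my_input, int size_per_process, int rank, int size)
{
    int key = 1;
    int temp = 0;

    while(key <= size/2)
    {
        if((rank+1) % key == 0)
        {
            if(rank/key % 2 == 0)
            {
                /* refresh temp each stride before sending (fix 1) */
                temp = my_input[size_per_process-1];
                MPI_Send(&temp, 1, MPI_INT, rank+key, 0, MPI_COMM_WORLD);
            }
            else
            {
                MPI_Recv(&temp, 1, MPI_INT, rank-key, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                /* add to the last element, not one past the end (fix 2) */
                my_input[size_per_process-1] += temp;
            }
        }
        /* every process updates the stride and reaches the barrier (fix 3) */
        key = 2 * key;
        MPI_Barrier(MPI_COMM_WORLD);
    }

    /* return value kept as in the question; it is not used below */
    return (*my_input);
}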

A full running example with all the problems fixed:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char **argv){
    int rank, mpisize, total_size = 16;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &mpisize);
    int *data = NULL;   

    if(rank == 0){
       data = malloc(total_size * sizeof(int));
       for(int i = 0; i < total_size; i++)
          data[i] = i;
    }
    int size_per_process = total_size / mpisize;
    int *localdata = malloc(size_per_process * sizeof(int));
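    /* distribute size_per_process consecutive elements of the whole array to each process */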
    MPI_Scatter(data, size_per_process, MPI_INT, localdata, size_per_process, MPI_INT, 0, MPI_COMM_WORLD);

    int key = 1;
    int temp = 0;
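    /* up-sweep: at each stride, the receiving rank adds the sender's last element to its own last element */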
    while(key <= mpisize/2){                      
      if((rank+1) % key == 0){
          if(rank/key % 2 == 0){    
             temp = localdata[size_per_process-1];
             MPI_Send(&temp, 1, MPI_INT, rank+key, 0, MPI_COMM_WORLD);
          }
          else {
             MPI_Recv(&temp, 1, MPI_INT, rank-key, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
             localdata[size_per_process-1]+= temp;
         }
      }
      key = 2 * key;
      MPI_Barrier(MPI_COMM_WORLD);
    }

    MPI_Gather(localdata, size_per_process, MPI_INT, data, size_per_process, MPI_INT, 0, MPI_COMM_WORLD);

    if(rank == 0){
       for(int i = 0; i < total_size; i++)
               printf("%d ", data[i]);
       printf("\n");
    }
    free(data);
    free(localdata);    
    MPI_Finalize();
    return 0;
}
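To try it, the example can be compiled with mpicc and launched with 4 processes (for example mpirun -np 4 ./a.out; the exact launcher depends on your MPI installation).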

Input:

[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]

Output:

[0,1,2,3,4,5,6,10,8,9,10,11,12,13,14,36]