1

I am trying to develop some recursive algorithm which I would like to run in parallel using open mp. I need to run the algorithm for multiple input so I want each of the Thread to run 1 input. Each Thread is independent but the results are stored in the same global variable (summing them) as the code shows:

#include <stdio.h>
#include <omp>

double * resultA;
double * resultB;

void recursiveAlgorithm(double a)
{
    double b = someFunction(a);
    uint c = anotherFunction(a);

    if (b < 0)
    {
         #pragma omp critical
         {
              resultA[c] = resultA[c] + b;
         }
         return;
    }
    if (b > 100)
    {
         #pragma omp critical
         { 
              resultB[c] = resultB[c] + b;     
         } 
         return;
    } 
    recursiveAlgorithm(b);
}


int main( int argc, const char* argv[] )
{
    double input[5] = {0, 1, 2, 3, 4};

    resultA = malloc(1000*1000*3, sizeof(double));
    resultB = malloc(1000*1000*3, sizeof(double));

    #pragma omp parallel for
    for (uint i; i < 5; i++){
         recursiveAlgorithm(input[i]);
    }
}

I have been using critical section to ensure that the variables resultA and resultB are not accessed simultaneously but I am not sure if it is the best for my case. The speed improvement is much less that I would expect. Is there any better approach for a code like this?

1 Answer 1

1

It sounds like your problem might be better solved with the reduction pattern. But its really hard to tell without more information on what you are actually computing.

See this question on how to do it for two variables and this question for the array case.

Also note that you can always implement your recursion stack yourself and parallelize the individual calls. The obvious benefit is better job balancing between thread if some recursions go much deeper than others.

Sign up to request clarification or add additional context in comments.

4 Comments

Thanks, the second solve my problem. I wasn't sure if there was any easier way. Regarding the second suggestion I think that in my case I will have better performance this way since in my real code I am running the recursive algorithm a big number of times (like 10000).
When dealing with recursive parallelism it is often much easier to use tasks than parallel loops...
@JimCownie yes this is what I wanted to express, implement the recursion as a stack of tasks.
@ypnos my point is that OpenMP has support for task parallelism. So you don't need to maintain stacks of tasks, just create OpenMP tasks as you need them, and be done.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.