OpenMP: How to utilize recursive function in each thread?

Question

#include <stdio.h>
#include<array>
#include<vector>
#include <omp.h>

std::vector<int> pNum;
std::array<int, 4> arr;
int pGen(int);
int main()
{
    pNum.push_back(2);
    pNum.push_back(3);
    pGen(10);
    for (int i = 0; i < pNum.size(); i++)
    {
        printf("%d \n", pNum[i]);
    }
    printf("top say: %d", pNum.size());
    getchar();
}
int pGen(int ChunkSize)
{
    //
    if (pNum.size() == 50) return 0;
    int i, k, n, id;
    int state = 0;

        //
#pragma omp parallel for schedule(dynamic) private(k, n, id) num_threads(4)
        for (i = 1; i < pNum.back() * pNum.back(); i++)
        {
            //
            id = omp_get_thread_num();
            n = pNum.back() + i * 2;
            for (k = 1; k < pNum.size(); k++)
            {
                //
                if (n % pNum[k] == 0) break;
                if (n / pNum[k] <= pNum[k])
                {
                    //
#pragma omp critical
                    {
                        //
                        if (state == 0)
                        {
                            //
                            state = 1; pNum.push_back(n); printf("id: %d; number: %d \n", id, n); pGen(ChunkSize);  break;
                        }
                    }
                }   
            }
            if (state == 1) break;
        }

}

This is my code above. I am trying to find first 50 prime number with openMP scheduling for each dynamic, static and guided. I started with dynamic. And somehow I realized I have to use recursive function since I cant use do - while in parallel structures.

When I debug the code above, console opens up and close down immediately, I can only see "id:0, number:5" and an "error: blablabla(something)"

The strange thing is I never get to getchar() and output the vector I use to store prime numbers. I think this is about recursion function. Any other theories?

edit: I happened to catch the error: this is the error

@HighPerformanceMark, as the OP is using Visual Studio (apparent from the linked screenshot), s/he is stuck with OpenMP 2.0, so no explicit tasks. — Hristo Iliev
– Hristo Iliev, Commented Mar 25, 2016 at 9:56
Yes I am using VS. Is there any detailed tutorial webpage about openMP. I could not find any with deep and detailed examples. Also, can I increase the error logging and throwing features of openMP in VS? — LuckyB
– LuckyB, Commented Mar 25, 2016 at 10:02
This is getting more and more interesting. I added omp_set_nested(1); cause I use nested loops and now only thread_0 generates 50 prime numbers correctly and throws this error: 'Fatal User Error 1002: '#pragma omp barrier' improperly nested in a work-sharing construct'... I did not use barrier in anywhere in the code. I started to think VS messing with me. — LuckyB
– LuckyB, Commented Mar 25, 2016 at 10:34
If you're only looking for 50 prime numbers the overheads of thread creation and so on that are implicit in OpenMP will never be recovered! So your serial code will be faster... — Jim Cownie
– Jim Cownie, Commented Mar 29, 2016 at 8:21

Gilles · Accepted Answer · 2016-03-25 05:36:44Z

I don't know if this is significant for your algorithm, but since you add numbers in your pNum vector during the main loop, pNum.back() will change over iterations. Therefore, the boundaries of the parallelised loop will change during the loop itself: for (i = 1; i < pNum.back() * pNum.back(); i++)

This isn't supported by OpenMP. Loops can only be parallelised with OpenMP if they are in Canonical Loop Form. The link explains it in details, but it boils down for you that the boundaries should be known and fixed prior to entering the loop:

lb and b: Loop invariant expressions of a type compatible with the type of var

Therefore, your code has an Undefined Behaviour. It may or may not compile, may or may not run and can give whatever result if any (or just reformat your hard drive).

If it is not important that pNum.back() evolves over iterations, then you can simply evaluate it prior to the loop and use this value as upper bound in the for statement. But if it is important, then you'll have to find another method to parallelise your loop.

Finally, a side note: this algorithm uses nested parallelism, but you didn't explicitly allow it so, as the nested parallelism is disabled by default, only the outermost call to pGen() will generate OpenMP threads.

Collectives™ on Stack Overflow

OpenMP: How to utilize recursive function in each thread?

1 Answer 1

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Your Answer

Sign up or log in

Post as a guest

Related