I would like to nest `parallel for` loops using OpenMP. Here is a toy example:
#include <iostream>
#include <cmath>

void subproblem(int m) {
    #pragma omp parallel for
    for (int j{0}; j < m; ++j) {
        double sum{0.0};
        for (int k{0}; k < 10000000; ++k) {
            sum += std::cos(static_cast<double>(k));
        }
        #pragma omp critical
        { std::cout << "Sum: " << sum << std::endl; }
    }
}

int main(int argc, const char *argv[]) {
    int n{2};
    int m{8};
    #pragma omp parallel for
    for (int i{0}; i < n; ++i) {
        subproblem(m);
    }
    return 0;
}
Here is what I want:
- If n >= (number of cores on my machine), I want only the outer loop to be parallelized.
- If n < (number of cores on my machine), I want OpenMP to launch threads in the inner loop as well, but the total number of threads must not exceed the number of cores on my machine.
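To make the two cases concrete, here is a rough sketch of the thread budget I have in mind. `ThreadSplit` and `split_threads` are hypothetical names of my own, not part of OpenMP, and the sketch assumes the core count is known (for example from `omp_get_num_procs()`):

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not an OpenMP API): threads assigned to each level.
struct ThreadSplit {
    int outer;  // threads for the loop over i
    int inner;  // threads for each call to subproblem()
};

// Assumes n >= 1 outer iterations and cores >= 1 available cores.
ThreadSplit split_threads(int n, int cores) {
    assert(n >= 1 && cores >= 1);
    // Never use more outer threads than there are iterations or cores.
    const int outer = std::min(n, cores);
    // Integer division rounds down, so outer * inner never exceeds cores.
    const int inner = std::max(1, cores / outer);
    return {outer, inner};
}
```

With n = 2 on an 8-core machine this gives 2 outer threads and 4 inner threads each; once n reaches the core count the inner value is 1, so only the outer loop runs in parallel. Presumably these values could be passed to `num_threads` clauses on the two pragmas with nested parallelism enabled, but I would prefer not to manage this by hand.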
So far, I have only found solutions that either disable nested parallelism or always allow it, but I am looking for a way to enable it only when the number of threads launched by the outer loop is below the number of cores.
Is there an OpenMP solution for that using tasks?