
I'm trying to run code in parallel, but I'm confused by the private/shared clauses and related OpenMP concepts. I'm using C++ (MSVC 12 or GCC) with OpenMP.

The code runs a loop whose body consists of a block that should run in parallel, followed by a block that should run once all the parallel work is done. The order in which the parallel work is processed doesn't matter. The code looks like this:

// X, M, N, Y, Z are constant values
const int processes = 4;
std::vector<double> vct(X);
std::vector<std::vector<double> > stackVct(processes, std::vector<double>(Y));
std::vector<std::vector<std::string> > files(processes, std::vector<std::string>(M));
for(int i=0; i < N; ++i)
{
  // parallel stuff
  for(int process = 0; process < processes; ++process)
  {
    std::vector<double> &otherVct = stackVct[process];
    const std::vector<std::string> &my_files = files[process];

    for(int file = 0; file < my_files.size(); ++file)
    { 
      // vct is read-only here, the value is not modified
      doSomeOtherStuff(otherVct, vct);

      // my_files[file] is read-only
      std::vector<double> thirdVct(Y);
      doSomeOtherStuff(my_files[file], thirdVct);

      // thirdVct and vct are read-only
      doSomeOtherStuff2(thirdVct, otherVct, vct);
    }
  }
  // when all the parallel stuff is done, do this job
  // single thread stuff
  // stackVct is read-only, vct is modified
  doSingleThreadStuff(vct, stackVct);
}

If it is better for performance, "doSingleThreadStuff(...)" can be moved into the parallel loop, but it needs to be processed by a single thread. The order of the function calls in the innermost loop cannot be changed.

How should I write the #pragma omp directives to make this work? Thanks!

2 Answers


To run a for loop in parallel, put #pragma omp parallel for directly above the for statement. By default, variables declared outside the loop are shared by all threads, while variables declared inside the loop body are private to each thread (the loop counter itself is also private).
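
As a minimal sketch of that default sharing rule (the function rowSums and its data are made up for illustration and are not part of the question's code):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: out[i] becomes the sum of rows[i].
std::vector<double> rowSums(const std::vector<std::vector<double> > &rows)
{
    const int n = static_cast<int>(rows.size());
    std::vector<double> out(rows.size(), 0.0);  // declared outside the loop: shared

    #pragma omp parallel for
    for (int i = 0; i < n; ++i)                 // loop counter: private per thread
    {
        double sum = 0.0;                       // declared inside the loop: private
        for (std::size_t j = 0; j < rows[i].size(); ++j)
            sum += rows[i][j];
        out[i] = sum;                           // each iteration writes only its own slot
    }
    return out;
}
```

Sharing out is safe here only because no two iterations write the same element. Built without OpenMP support (for example without -fopenmp on GCC or /openmp on MSVC), the pragma is ignored and the loop simply runs serially.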

Note that if you are doing file IO in parallel you may not see much speedup (next to none if all you are doing is file IO) unless at least some of the files reside on different physical hard drives.


2 Comments

I'm doing some IO in "doSomeOtherStuff(...)", but I can pre-load everything into memory.
And thanks. If it is that simple, then the crash is probably not caused by my code but by one of the libraries I'm using...

Maybe something like this (mind you, this is just a sketch that I did not verify, but you should get the idea):

// X, M, N, Y, Z are constant values
const int processes = 4;
std::vector<double> vct(X);
std::vector<std::vector<double> > stackVct(processes, std::vector<double>(Y));
std::vector<std::vector<std::string> > files(processes, std::vector<std::string>(M));
for(int i=0; i < N; ++i)
{
    // parallel stuff
    #pragma omp parallel firstprivate(vct, files) shared(stackVct)
    {
        #pragma omp for
        for(int process = 0; process < processes; ++process)
        {
            std::vector<double> &otherVct = stackVct[process];
            const std::vector<std::string> &my_files = files[process];

            for(int file = 0; file < my_files.size(); ++file)
            {
                // vct is read-only here, the value is not modified
                doSomeOtherStuff(otherVct, vct);

                // my_files[file] is read-only
                std::vector<double> thirdVct(Y);
                doSomeOtherStuff(my_files[file], thirdVct);

                // thirdVct and vct are read-only
                doSomeOtherStuff2(thirdVct, otherVct, vct);
            }
        }
        // when all the parallel stuff is done, do this job
        // single thread stuff
        // stackVct is read-only, vct is modified
        #pragma omp single nowait
        doSingleThreadStuff(vct, stackVct);
    }
}
  • I marked vct and files as firstprivate because they are read-only and I assumed they should not be modified, so each thread gets its own copy of these variables.
  • stackVct is marked as shared among all threads because they modify it.
  • Finally, only one thread executes the doSingleThreadStuff function, without forcing the other threads to wait.
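
One caveat with the sketch above: because vct is firstprivate inside the region, the call in the single block would update one thread's private copy, and that copy is discarded when the region ends. A hedged variant that keeps everything shared and moves the outer loop inside a single parallel region might look like the following. The function bodies are hypothetical stand-ins (processFile replaces the three calls in the innermost loop), and run is a made-up wrapper so the example is self-contained:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the parallel work: reads vct, writes only `mine`.
static void processFile(const std::string &file, const std::vector<double> &vct,
                        std::vector<double> &mine)
{
    for (std::size_t k = 0; k < mine.size(); ++k)
        mine[k] += vct[k % vct.size()] + static_cast<double>(file.size());
}

// Hypothetical stand-in for the serial step: reads stackVct, modifies vct.
static void doSingleThreadStuff(std::vector<double> &vct,
                                const std::vector<std::vector<double> > &stackVct)
{
    for (std::size_t p = 0; p < stackVct.size(); ++p)
        for (std::size_t k = 0; k < stackVct[p].size(); ++k)
            vct[k % vct.size()] += stackVct[p][k];
}

std::vector<double> run(int N, std::vector<double> vct,
                        std::vector<std::vector<double> > stackVct,
                        const std::vector<std::vector<std::string> > &files)
{
    const int processes = static_cast<int>(stackVct.size());

    // One thread team for the whole computation. Everything declared above
    // is shared; the counter i is declared inside the region, so it is private.
    #pragma omp parallel
    for (int i = 0; i < N; ++i)          // every thread runs the outer loop
    {
        #pragma omp for
        for (int process = 0; process < processes; ++process)
        {
            std::vector<double> &mine = stackVct[process];
            const std::vector<std::string> &my_files = files[process];
            for (std::size_t f = 0; f < my_files.size(); ++f)
                processFile(my_files[f], vct, mine);
        }   // implicit barrier: all slots of stackVct are complete here

        #pragma omp single
        doSingleThreadStuff(vct, stackVct);
        // implicit barrier after single (no nowait): vct is up to date
        // before any thread starts the next outer iteration
    }
    return vct;
}
```

Keeping the team alive across outer iterations avoids re-entering the parallel region N times, and the barrier after single is what guarantees that the next round of parallel work sees the updated vct. Whether this is faster than opening a region per iteration depends on the runtime and the amount of work per iteration.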

5 Comments

"without forcing other threads to wait." Hold on, I'm not sure I understand what you mean. "doSingleThreadStuff" should run once all the parallel work is done, and the next round of parallel work (i.e. in the next iteration) should start only after the single-thread work is done. Will it work like that?
By the way, using "#pragma omp parallel firstprivate(vct, files) shared(stackVct)" seems to behave strangely: if I use it, it runs many more processes...
There is an implicit barrier at the end of the for directive, so all threads are guaranteed to finish their share of the inner for-loop before any of them moves on. Afterwards only one thread executes doSingleThreadStuff. The nowait clause removes the barrier at the end of the single construct, so the other threads do not wait there. Note, though, that the parallel region itself ends with an implicit barrier that cannot be removed, so in this sketch the next iteration of the outer for-loop still starts only after the single block has finished. If you want the threads to wait at the single construct itself, remove the nowait clause.
OK, thank you so much. Could you also explain why OpenMP runs the parallel loop many more times if I use firstprivate?
I have no idea. The purpose of firstprivate is to give each thread a private copy of the variable, initialized with the value of the original. It should not affect the size of the thread team. Inside a parallel region you can check the number of threads in the team with omp_get_num_threads, and you can get the ID of the calling thread with omp_get_thread_num. Use those to debug and verify this behavior.
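
A small debugging sketch along those lines, assuming only the standard omp.h runtime calls named above (the helper name reportTeam is made up):

```cpp
#include <cstdio>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: prints one line per thread and returns the team size
// (1 when the code is built without OpenMP support).
int reportTeam()
{
    int teamSize = 1;                     // shared: declared outside the region

    #pragma omp parallel
    {
#ifdef _OPENMP
        const int id = omp_get_thread_num();   // private: declared inside

        #pragma omp single
        teamSize = omp_get_num_threads();
        // implicit barrier after single: teamSize is set before anyone prints

        #pragma omp critical
        std::printf("thread %d of %d\n", id, teamSize);
#endif
    }
    return teamSize;
}
```

Calling this once before the real loop shows how many threads the runtime actually starts. Note that omp_get_num_threads returns 1 when called outside a parallel region, so the check has to sit inside one.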
