
I was looking at a program using the Fortran version of OpenMP, and I encountered a rather strange construct. To parallelize a do loop, the following construct was used:

count = 0
!$OMP PARALLEL
do
   if (count > N) exit
   !$OMP CRITICAL
   count = count + 1
   !$OMP END CRITICAL
   call WORK ! do some work here
end do
!$OMP END PARALLEL

I am not really sure whether the above code actually makes the do loop parallel. I know that the standard way of doing this is to use the following work-sharing construct:

!$OMP PARALLEL
!$OMP DO
do count = 1, N
   call WORK ! do some work here
end do
!$OMP END DO
!$OMP END PARALLEL

I have tested both possibilities by implementing the standard work-sharing construct, and observed some speedup when using it. I can imagine that the !$OMP CRITICAL construct might act as a bottleneck and cause some slowdown. I also think that the non-standard method of work sharing might be beneficial if threads execute at different speeds. However, I am not really sure how accurate my thoughts are.

Thank you in advance

Alex

2 Answers


Your thoughts are correct; what the original code does is equivalent to:

!$OMP PARALLEL DO SCHEDULE(DYNAMIC) 
do count = 0,N
  call WORK ! do some work here
end do
!$OMP END PARALLEL DO

Basically, it implements dynamic loop scheduling in a very awkward and inefficient way. If call WORK always takes the same amount of time, i.e. there are no conditions that result in work imbalance, then the SCHEDULE(DYNAMIC) clause can be replaced by SCHEDULE(STATIC) for improved performance.
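For reference, a minimal sketch of that static-schedule variant (WORK and N taken from the question; the loop variable is made private automatically by the PARALLEL DO construct):

!$OMP PARALLEL DO SCHEDULE(STATIC)
do count = 0, N
   call WORK ! do some work here
end do
!$OMP END PARALLEL DO

With static scheduling each thread is assigned its chunk of iterations up front, so there is no per-iteration synchronization at all, unlike the critical-section counter in the original code.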


1 Comment

Thank you for your reply. I'm quite happy to have my humble assumptions confirmed. I suspected this would turn out to be the case.

If the !$omp do is not used, then every thread does the same work as the others. The omp do directive divides the loop's index space between the threads: the threads do not repeat the same work, but run different iterations of the indexed loop. In your case there are no loop indexes, so there is nothing to divide, and all threads run the same code.
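To illustrate the division of iterations, a small self-contained sketch (program and loop bound are illustrative, not from the question) that prints which thread runs which iteration; with !$OMP DO each iteration is executed by exactly one thread, whereas without the directive every thread would print all eight lines:

program demo
   use omp_lib
   implicit none
   integer :: i
   !$OMP PARALLEL
   !$OMP DO
   do i = 1, 8
      ! Each iteration is assigned to exactly one thread.
      print '(A,I0,A,I0)', 'iteration ', i, ' on thread ', omp_get_thread_num()
   end do
   !$OMP END DO
   !$OMP END PARALLEL
end program demo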

For just updating count, omp atomic could be used, though if WORK takes long enough to compute, the difference may be negligible. I am also worried that reading count in the exit condition of the original code is a data race.

count = 0
!$OMP PARALLEL
do
   !$OMP ATOMIC READ
   count2 = count
   if (count2 > N) exit
   !$OMP ATOMIC UPDATE
   count = count + 1
   call WORK ! do some work here
end do
!$OMP END PARALLEL

The intention here is to call WORK N times and do individual calls in parallel. There may be more elegant ways to achieve this.
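One such way, sketched here as a suggestion rather than a tested drop-in, is the ATOMIC CAPTURE construct (OpenMP 3.1 and later), which increments the counter and reads the new value in one atomic step. Each thread then claims a distinct iteration number, closing the window in the version above where two threads can both read the same value of count before either updates it. The variable mycount is a hypothetical helper introduced for this sketch:

count = 0
!$OMP PARALLEL PRIVATE(mycount)
do
   ! Increment and capture the new value atomically, so no
   ! two threads can claim the same iteration number.
   !$OMP ATOMIC CAPTURE
   count = count + 1
   mycount = count
   !$OMP END ATOMIC
   if (mycount > N) exit
   call WORK ! do some work here
end do
!$OMP END PARALLEL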

1 Comment

Hi, thank you for your reply. It turns out that my initial assumptions were indeed correct.
