OpenMP reduction on multiple variables (array)

Question

I am trying to do a reduction on multiple variables (an array) using OpenMP, but I am not sure how to implement it. See the code below.

#pragma omp parallel for reduction( ??? )
for (int i = 0; i < n; i++) {
        for (int j = 0; j < m; j++) {
                [ compute value ... ]

                y[j] += value;
        }
}

I thought I could do something like this with the atomic directive, but I realised this would prevent two threads from updating y at the same time even if they are updating different elements.

#pragma omp parallel for
for (int i = 0; i < n; i++) {
        for (int j = 0; j < m; j++) {
                [ compute value ... ]

                #pragma omp atomic
                y[j] += value;
        }
}

Does OpenMP have any functionality for something like this? If not, how would I achieve this optimally without OpenMP's reduction clause?

You could declare only the j loop to be omp parallel. If that's inefficient, for instance because the loop is too short, then try to exchange the two loops.
– Victor Eijkhout, Commented Mar 6, 2022 at 22:21
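
The loop exchange suggested in this comment could look roughly like the sketch below. It assumes that the value depends only on i and j, so the loops can be swapped safely. compute_value and accumulate_by_column are hypothetical names standing in for the question's "[ compute value ... ]" and the surrounding code. With j as the parallel loop, each y[j] is written by exactly one thread, so neither a reduction clause nor atomic is needed.

```c
/* Hypothetical stand-in for the question's "[ compute value ... ]". */
static double compute_value(int i, int j) {
    return (double)(i + 1) * (double)(j + 1);
}

/* Parallelise over j instead of i: every y[j] has a single writer. */
void accumulate_by_column(double *y, int n, int m) {
    #pragma omp parallel for
    for (int j = 0; j < m; j++) {
        double sum = 0.0;  /* declared inside the loop, so private to the thread */
        for (int i = 0; i < n; i++) {
            sum += compute_value(i, j);
        }
        y[j] += sum;       /* no other thread touches this element */
    }
}
```

Whether this is faster depends on m being large enough to keep all threads busy and on how the computation accesses memory.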

Laci · Accepted Answer · 2022-03-06 19:11:31Z

There is an array reduction available in OpenMP since version 4.5:

#pragma omp parallel for reduction(+:y[:m])

where m is the size of the array. The main limitation is that the private copies of the array used in the reduction are reserved on the stack, so this approach cannot be used for large arrays.
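
As a minimal sketch, the clause applied to the loop from the question might look like this. compute_value and accumulate_with_reduction are hypothetical names; the first only stands in for the question's "[ compute value ... ]". The code assumes a compiler with OpenMP 4.5 support and m > 0.

```c
/* Hypothetical stand-in for the question's "[ compute value ... ]". */
static double compute_value(int i, int j) {
    return (double)(i + j);
}

void accumulate_with_reduction(double *y, int n, int m) {
    /* Each thread works on a private, zero-initialised copy of y[0..m-1];
       the copies are summed into the original y when the loop finishes. */
    #pragma omp parallel for reduction(+:y[:m])
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < m; j++) {
            y[j] += compute_value(i, j);
        }
    }
}
```

Because y is a pointer here, the array section needs an explicit length (y[:m]); the lower bound defaults to 0.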

The atomic operation you mentioned should work fine, but it may be less efficient than reduction. Of course, it depends on the actual circumstances (e.g. actual value of n and m, time to compute value, false sharing, etc.).

#pragma omp atomic
  y[j] += value;


5 Comments

DavieRodger Over a year ago

Ah... in my particular case y is dynamically allocated with its size determined at runtime. As you have suggested, the atomic operation does work, but from my understanding it would hurt performance unnecessarily: two threads could in theory update y[i] and y[j] for different i and j, but the atomic operation would not allow them to do so at the same time. Is this correct?

Laci Over a year ago

An atomic operation always gives correct results, and it allows different elements y[i] and y[j] to be updated at the same time. The only performance-related problem is that if they are in the same cache line, each memory write invalidates that cache line. This is called 'false sharing'. If array y is expected to be big, it is best to do the reduction manually.

Laci Over a year ago

Please read this if you wish to implement manual array reduction in OpenMP.
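
One possible shape of such a manual reduction is sketched below; it is not taken from the linked material. Each thread accumulates into its own heap-allocated buffer, which avoids the stack limit, and the buffers are merged into y one thread at a time. compute_value and accumulate_manual are hypothetical names, and m >= 0 is assumed. If a thread cannot allocate its buffer, this sketch falls back to atomic updates for that thread.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the question's "[ compute value ... ]". */
static double compute_value(int i, int j) {
    return (double)(i * j);
}

void accumulate_manual(double *y, int n, int m) {
    #pragma omp parallel
    {
        /* Per-thread partial sums on the heap, zero-initialised. */
        double *local = calloc((size_t)m, sizeof *local);

        #pragma omp for
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < m; j++) {
                double value = compute_value(i, j);
                if (local != NULL) {
                    local[j] += value;
                } else {
                    /* Allocation failed: update the shared array atomically. */
                    #pragma omp atomic
                    y[j] += value;
                }
            }
        }
        /* The implicit barrier after the loop ensures all atomic updates
           are finished before any thread starts merging. */

        if (local != NULL) {
            #pragma omp critical
            {
                for (int j = 0; j < m; j++) {
                    y[j] += local[j];
                }
            }
            free(local);
        }
    }
}
```

The merge step is serialised by the critical section, so this pays off mainly when n is large compared with the number of threads.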

Laci Over a year ago

Is the computation of value slow or fast? If it is fast, is it possible to swap the for loops (do for(int j=...) first)?

Laci Over a year ago

I think in your comment you confused #pragma omp critical and #pragma omp atomic. #pragma omp critical does not allow more than one thread at a time to execute the protected code.
