Summing up array elements using OpenMP is not working correctly for large arrays (C)

Question

I wanted to use OpenMP around a for-loop that basically sums up the array elements. For comparison, I also sum up the array elements in a serial fashion. For large array-sizes, array_size>=5795, the sums are no longer equal.

This is the OpenMP-parallelized loop:

#pragma omp parallel for reduction(+:parallel_sum)
    for (i = 0; i < array_size; i++) {
        parallel_sum += array[i];
    }

And this is the full code:

#include<stdio.h>
#include<stdlib.h>
#include<omp.h>

int main() {
    float          *array, serial_sum, parallel_sum;
    size_t          array_size, i;

    serial_sum   = 0.;
    parallel_sum = 0.;

#pragma omp parallel
    printf( "number of threads used in parallized part: %d\n", omp_get_num_threads());

    printf("enter size of array\n");
    scanf("%zu", &array_size);

    // memory allocation
    array = (float *) malloc(sizeof(float) * array_size);

    // initialize array
    for (i = 0; i < array_size; i++)
        array[i] = i;

    // calculate the array sums, parallel first, then serial
#pragma omp parallel for reduction(+:parallel_sum)
    for (i = 0; i < array_size; i++) {
        parallel_sum += array[i];
    }

    for (i = 0; i < array_size; i++)
        serial_sum = serial_sum + array[i];

    if (serial_sum == parallel_sum)
        printf("\nserial and parallel sums are equal\n");
    else
        printf("\nserial and Parallel sums are NOT equal\n");

    // free memory
    free(array);

    return 0;
}

Since the sum can be easily calculated by hand using the sum-formula by Gauss, sum = (n^2+n)/2, where n corresponds to array_size, both sums are wrong... I guess I am having some memory issues then...?

Typical float types only have ~6 decimal digits of precision. Floating-point addition is not associative if some of the operations are inexact. — user3185968
– user3185968, Commented Sep 21, 2016 at 8:58
@EOF ohhhh, of course! That is slightly embarrassing... thanks :) Type that as an answer and I mark it as solution. — Alf
– Alf, Commented Sep 21, 2016 at 9:01
Depending on your compile options, your serial_sum may involve parallel batched sums, with different partitioning from your parallel reduction. — tim18
– tim18, Commented Sep 21, 2016 at 10:28
@tim18 I was trying to keep the compiler options short, basically just gcc -fopenmp -Wall — Alf
– Alf, Commented Sep 21, 2016 at 10:43

Community · Accepted Answer · 2017-05-23 10:27:24Z

1

Since floating-point types have limited precision, addition is not associative (see Are floating point operations in C associative? and Is Floating point addition and multiplication associative?).

So a + (b + c) can evaluate unequal to (a + b) + c.

When parallelizing/vectorizing, the order operations can indeed be different than adding the numbers from lowest to highest sequentially.

If you printed out the computed values, you would likely find them quite close in this case, just not exactly equal.

edited May 23, 2017 at 10:27

CommunityBot

11 silver badge

answered Sep 21, 2016 at 9:05

user3185968

Sign up to request clarification or add additional context in comments.

2 Comments

Alf Over a year ago

and an easy fix for the above mentioned code would be double serial_sum, parallel_sum;

user3185968 Over a year ago

@Alf: Well, you'd end up running into the same problem for higher numbers. Around ~pow(10,8) or so.

Collectives™ on Stack Overflow

Summing up array elements using OpenMP is not working correctly for large arrays (C)

1 Answer 1

2 Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

2 Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related