OpenMP loop parallelize

Question

I'm learning OpenMP and have run into a problem: my parallel program is slower than the serial one, and I'm confused (1 thread vs. 2 threads). My code:

#include <iostream>
#include <omp.h>
using namespace std;

int main()
{       
    int threadsNumber=1;
    int S=0;

    cout << "Enter number of threads:\n";
    cin >> threadsNumber;

    double start, end, calculationTime;
    omp_set_num_threads(threadsNumber);
    start = omp_get_wtime();

    #pragma omp parallel for reduction(+: S)
    for(int i=1;i<1000;i++) {
        S+= 10;
    }
    #pragma omp end parallel

    end = omp_get_wtime();

    calculationTime = end - start;

    cout << "Execution time: " << calculationTime << "\n";
    cout<<"S = "<< S <<"\n";

    return 0;
}

Results: 1 thread: 2.59876e-05 s; 2 threads: 0.000102043 s

Where is my mistake? Thank you!

I'm not 100% sure, but the reduction at the end may consume more time than you save by using 2 threads. Perhaps a more complex calculation would give you better results.
– daniel m, Commented Nov 23, 2013 at 20:53
Try 1000000000 instead of 1000. It should report about 2 seconds for 1 thread and about 1 second for 2 threads. If compiler optimizations are enabled, use volatile variables to prevent the loop from being optimized out.
– jfs, Commented Nov 23, 2013 at 20:57
One minor thing to add: there's no "#pragma omp end parallel" in C/C++. Since you're writing C++ code, the end of the parallel region is determined automatically by the end of the structured block.
– Michael Klemm, Commented Nov 24, 2013 at 19:44
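To illustrate Michael Klemm's point, here is a minimal sketch of an explicit parallel region in C++. The function name count_iterations is hypothetical and chosen only for this example. The region is delimited by the braces of the structured block that follows the directive, so nothing needs to mark its end. If OpenMP is not enabled at compile time, the pragmas are ignored and the loop simply runs serially.

```cpp
#include <cstdint>

// Hypothetical helper: counts loop iterations inside an explicit parallel region.
std::int64_t count_iterations(std::int64_t n) {
    std::int64_t count = 0;

    #pragma omp parallel
    {   // the parallel region starts with this structured block
        #pragma omp for reduction(+: count)
        for (std::int64_t i = 0; i < n; ++i) {
            count += 1;
        }
    }   // the region ends at this closing brace; no "end parallel" directive is needed

    return count;
}
```

Calling count_iterations(999) mirrors the 999 iterations of the loop in the question (i from 1 to 999).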

Philipp · Accepted Answer · 2013-11-23 21:22:33Z

2

As J.F. Sebastian pointed out in a comment, you don't get much benefit from parallelization because a loop with only 1000 iterations finishes very quickly. The overhead of creating a second thread is therefore larger than the time you save by splitting the work. When you increase the number of loop iterations and thus give the threads more to do, the benefit of multi-threading becomes more apparent.
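One way to see this effect is to time the same loop for different iteration counts. The following is only a sketch: the helper names sum_tens and time_sum_tens are hypothetical, and it uses std::chrono instead of omp_get_wtime so that it also builds when OpenMP is not enabled (the pragma is then ignored). With compiler optimizations turned on, the loop may be folded into a constant expression, so the timings are only meaningful for an unoptimized build or a loop body with real work.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper: adds 10 once per iteration, like the loop in the question.
std::int64_t sum_tens(std::int64_t iterations) {
    std::int64_t s = 0;
    #pragma omp parallel for reduction(+: s)
    for (std::int64_t i = 0; i < iterations; ++i) {
        s += 10;
    }
    return s;
}

// Hypothetical helper: runs sum_tens once, stores its result,
// and returns the elapsed wall-clock time in seconds.
double time_sum_tens(std::int64_t iterations, std::int64_t& result) {
    const auto start = std::chrono::steady_clock::now();
    result = sum_tens(iterations);
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(end - start).count();
}
```

Comparing time_sum_tens(999, r) with time_sum_tens(1000000000, r) for different thread counts (set, for example, through the OMP_NUM_THREADS environment variable) should show the fixed thread start-up cost dominating the small case.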

2 Comments

Hristo Iliev Over a year ago

I don't know which C++ compiler the OP uses, but if it is the Intel C++ Compiler or GCC at optimisation level -O2 or higher, the loop is replaced with S = max(high-low, 0) * 10; (where low and high are the bounds of the iteration range), which only makes the OpenMP version even slower in comparison.

jfs Over a year ago

@HristoIliev: as I've mentioned, a volatile variable would prevent the optimization.
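The closed form that Hristo Iliev describes can be checked against the explicit loop. This is a small sketch with hypothetical function names (loop_sum and closed_form_sum); it only shows that the two expressions agree, not what any particular compiler actually emits.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: the loop from the question, with explicit bounds [low, high).
std::int64_t loop_sum(std::int64_t low, std::int64_t high) {
    std::int64_t s = 0;
    for (std::int64_t i = low; i < high; ++i) {
        s += 10;
    }
    return s;
}

// Hypothetical helper: the closed form an optimizing compiler can substitute.
// An empty range (high <= low) contributes nothing, hence the max with 0.
std::int64_t closed_form_sum(std::int64_t low, std::int64_t high) {
    return std::max<std::int64_t>(high - low, 0) * 10;
}
```

For the bounds in the question, low = 1 and high = 1000, both functions return 9990.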