I have the following embarrassingly parallel loop:
    //#pragma omp parallel for
    for (i = 0; i < tot; i++)
        pointer[i] = val;
Why does uncommenting the #pragma line cause performance to drop? When I use OpenMP to parallelize this for loop, the program's run time increases slightly. Since each iteration writes to a different element and is independent of the others, shouldn't parallelizing it make the program much faster?
Is it possible that when tot is not large enough, the overhead of parallelizing the loop outweighs the gain and slows things down?
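
One way to check the overhead hypothesis would be to time both versions of the loop for several values of tot. The following is only a minimal sketch under stated assumptions: the element type double and the helper names fill_serial, fill_parallel, wall_seconds, and time_fill are made up for illustration and are not part of the original program. It assumes compilation with OpenMP enabled (for example gcc -fopenmp); without that, the pragma is ignored and both functions run serially.

```c
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Serial version: the loop from the question with the pragma left out.
   The element type double is an assumption. */
void fill_serial(double *pointer, long tot, double val)
{
    for (long i = 0; i < tot; i++)
        pointer[i] = val;
}

/* Parallel version: the same loop with the pragma enabled. */
void fill_parallel(double *pointer, long tot, double val)
{
    #pragma omp parallel for
    for (long i = 0; i < tot; i++)
        pointer[i] = val;
}

/* Wall-clock time in seconds. With OpenMP this uses omp_get_wtime();
   otherwise it falls back to clock(), which reports CPU time and is
   adequate only because the code is then single-threaded. */
double wall_seconds(void)
{
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Average seconds per call of fill(), measured over reps repetitions.
   Each repetition writes a different value (the repetition index). */
double time_fill(void (*fill)(double *, long, double),
                 double *pointer, long tot, int reps)
{
    double start = wall_seconds();
    for (int r = 0; r < reps; r++)
        fill(pointer, tot, (double)r);
    return (wall_seconds() - start) / reps;
}
```

Calling time_fill(fill_serial, buf, tot, reps) and time_fill(fill_parallel, buf, tot, reps) on the same buffer for a small, a medium, and a large tot should show whether the parallel version only falls behind when tot is small, which is what the overhead explanation would predict.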