I have the following piece of C code:
    double findIntraClustSimFullCoverage(cluster * pCluster)
    {
        double sum = 0;
        register int i = 0, j = 0;
        double perElemSimilarity = 0;
        for (i = 0; i < 10000; i++)
        {
            perElemSimilarity = 0;
            for (j = 0; j < 10000; j++)
            {
                perElemSimilarity += arr[i][j];
            }
            perElemSimilarity /= pCluster->size;
            sum += perElemSimilarity;
        }
        return (sum / pCluster->size);
    }
NOTE: arr is a matrix of size 10000 X 10000
This is a portion of a GA code, so this nested for loop runs many times. It hurts performance: the program takes a very long time to produce results. I profiled the code using valgrind / kcachegrind, which indicated that 70% of the process execution time was spent in this nested for loop. The register variables i and j do not seem to be stored in registers (profiling with and without the "register" keyword indicated this).
I cannot find a way to optimize this nested for loop, as it is very simple and straightforward. Please help me optimize this portion of the code.
The "register" keyword is pretty much ignored by all modern compilers, so don't expect to see a difference.
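Since "register" won't help, here is one thing that might be worth trying. This is only a sketch, not a measured fix. As posted, the loop adds up every element of arr and divides by pCluster->size twice, so the division can be moved out of the loop and done once at the end. The sketch assumes that cluster has an int field called size and that arr is a global 2D array of double. N is a small stand-in for 10000, and findIntraClustSimHoisted is a made-up name. The result can differ from the original in the last few bits because the floating-point rounding order changes.

```c
#include <stddef.h>

#define N 4  /* hypothetical small size for illustration; the question uses 10000 */

/* Assumed layout: only the size field is used by this function. */
typedef struct {
    int size;
} cluster;

/* Assumed declaration: the question only says arr is a 10000 x 10000 matrix. */
static double arr[N][N];

/* Sums every element of arr once, then divides by size * size once.
   Mathematically this equals the original (each row sum / size, then total / size). */
double findIntraClustSimHoisted(const cluster *pCluster)
{
    double total = 0.0;

    for (size_t i = 0; i < N; i++) {
        const double *row = arr[i];   /* walk each row contiguously */
        for (size_t j = 0; j < N; j++) {
            total += row[j];
        }
    }

    double size = (double)pCluster->size;
    return total / (size * size);
}
```

Removing 10000 divisions per call is unlikely to matter much on its own, since the 10^8 additions and memory reads dominate. The more interesting consequence is that the total does not depend on pCluster at all. If arr does not change between calls (an assumption, the question does not say), the total could be computed once and reused, leaving a single division per call.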