0

I have two programs, compiled with g++ and executed on linux. Program 1 creates a 2D array and then measures how long it takes to access all of its elements 100000 times:

#include <time.h>
#include <iostream>

int main()
{
  clock_t time;
  int i, y, x;

  int matrix[9][9]{{ 0,  1,  2,  3,  4,  5,  6,  7,  8},
                   { 9, 10, 11, 12, 13, 14, 15, 16, 17},
                   {18, 19, 20, 21, 22, 23, 24, 25, 26},
                   {27, 28, 29, 30, 31, 32, 33, 34, 35},
                   {36, 37, 38, 39, 40, 41, 42, 43, 44},
                   {45, 46, 47, 48, 49, 50, 51, 52, 53},
                   {54, 55, 56, 57, 58, 59, 60, 61, 62},
                   {63, 64, 65, 66, 67, 68, 69, 70, 71},
                   {72, 73, 74, 75, 76, 77, 78, 79, 80}};

  time = clock();

  for (i = 0; i < 100000; i++)
  {
    for (x = 0; x < 9; x++)
    {
      for (y = 0; y < 9; y++)
      {
        matrix[x][y];
      }
    }
  }

  time = clock() - time;
  std::cout << "Clicks:     " << time << std::endl;
  std::cout << "Time taken: " << (double) time / CLOCKS_PER_SEC << "s" << std::endl;
}

Program 2 creates a 1D array and also measures how long it takes to access all of its elements 100000 times:

#include <time.h>
#include <iostream>

int main()
{
  clock_t time;
  int i, j;

  int vector[81] = { 0,  1,  2,  3,  4,  5,  6,  7,  8,
                     9, 10, 11, 12, 13, 14, 15, 16, 17,
                    18, 19, 20, 21, 22, 23, 24, 25, 26,
                    27, 28, 29, 30, 31, 32, 33, 34, 35,
                    36, 37, 38, 39, 40, 41, 42, 43, 44,
                    45, 46, 47, 48, 49, 50, 51, 52, 53,
                    54, 55, 56, 57, 58, 59, 60, 61, 62,
                    63, 64, 65, 66, 67, 68, 69, 70, 71,
                    72, 73, 74, 75, 76, 77, 78, 79, 80};

  time = clock();

  for (i = 0; i < 100000; i++)
  {
    for (j = 0; j < 81; j++)
    {
      vector[j];
    }
  }

  time = clock() - time;
  std::cout << "Clicks:     " << time << std::endl;
  std::cout << "Time taken: " << (double) time / CLOCKS_PER_SEC << "s" << std::endl;
}

After executing program 1 my output is:

Clicks:     8106
Time taken: 0.008106s

After executing program 2 my output is:

Clicks:     15958
Time taken: 0.015958s

It is my understanding that a 1D array is stored in a continuous block of memory. Likewise the rows of a static 2D array are stored in contiguous blocks of memory. Conversely the rows of a dynamic 2d array might not be stored in contiguous blocks of memory. If this is true then program 2 should be at least similar in speed to program 1 therefore my question is why would program 1 be remarkably faster than program 2?

11
  • Correct me if I'm wrong, but doesn't c treat 2d and 1d arrays pretty much the same? Commented Nov 22, 2016 at 17:17
  • 1
    What compiler and OS did you test it on? Commented Nov 22, 2016 at 17:22
  • @KyleKW I believe so but the indexing would be different i.e in the 2D array you access each element with an index = width * column index + row index. My concern here is why would the 2D array be faster to access than the 1D array. Commented Nov 22, 2016 at 17:23
  • 1
    Following what I was saying, you may want to look at this answer (2nd and 3rd paragraph are talking about caching and prefetching ) Commented Nov 22, 2016 at 17:44
  • 1
    Trying to measure performance of such short period will often be biased due to noise caused by other processes - scheduler manages context switches on a level of 10-20 ms so you need your test to be atleast a magnitude slower for it to have no effect. After increasing number of iterations to 10000000 I got reverse results: 4.69s for 2D array and 3.44s for 1D array, in both cases error in measurement is less than 1%. Commented Nov 22, 2016 at 18:23

2 Answers 2

2

Here is what I found:

  1. If you actually make use of the value then the run time is almost the same, for example, change matrix[x][y]; to matrix[x][y] += 1; and vector[j]; to vector[j] += 1;

    >     Clicks:     28519
    >     Time taken: 0.028519s
    

    and

    >     Clicks:     29941
    >     Time taken: 0.029941s
    
  2. Without the above changes, optimize while compiling, g++ -O3 <filename>.cpp, this results in same time, got the same following output for both programs:

    $./a.out

    >     Clicks:     2
    >     Time taken: 2e-06s
    

So, what you are pointing out is compiler optimizations.

Sign up to request clarification or add additional context in comments.

2 Comments

Not sure why the downvote for you, but I countered it for now. I would lead off your answer with your conclusion though, because these questions come up so often and 99% of the time, it is because the are coming to conclusions on way to limited a set of data.
@michaelDorgan Thank you :)
1

The loops are likely to be removed (kind of optimising) by compiler, because

  1. You actually did nothing in loops.

  2. The matrix can be treated as a const array.

  3. program 1 is faster than program 2. ( :< )

To see whether the deletion happens to your code during compiling, you can increase the most outer loop by 100 times, and see whether the time needed for execution is increased significantly (not necessarily by exact 100 times).

If true, you can prevent this kind of optimising by doing some actual works in loop (calculate the sum, and don't forget printing it afterwards) and introduce some "unpredictable" changes to your matrix, for example:

srand(10);
for (int i=0; i<9; ++i) {
  matrix[i][i] = rand()%100;
}

And further, compiler may conduct some other optimising to your code, for example, expand your loops, even the address of the element you are visiting (they are no longer calculated at run time), you can prevent this by making the executing times of loops "unpredictable" :

#include <chrono>
#include <iostream>
#include <cstdlib>

int array[100];
int array2[10][10];

int64_t Sum1D(int len) {
  int64_t sum = 0;
  for (int i=0; i<100000; ++i) {
    for (int j=0; j<len; ++j) {
        sum += array[j];
    }
  }
  return sum;
}

int64_t Sum2D(int len1, int len2) {
  int64_t sum = 0;
  for (int i=0; i<100000; ++i) {
    for (int j=0; j<len1; ++j) {
      for (int k=0; k<len2; ++k)
        sum += array2[j][k];
    }
  }
  return sum;
}

int main()
{
  for (int i=0; i<100; ++i) {
    array[i] = rand();
    array2[i%10][i/10] = rand();
  }

  auto time = std::chrono::steady_clock::now();

  //int64_t sum = Sum1D(100);
  int64_t sum = Sum2D(10,10);

  auto duration = std::chrono::steady_clock::now()-time;
  std::cout << sum << "!" << duration.count() << std::endl;

  return 0;
} 

which finally makes program1 slower than program2. ( :> )

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.