Why would 2D array access be faster than 1D array access?

Question

I have two programs, compiled with g++ and executed on linux. Program 1 creates a 2D array and then measures how long it takes to access all of its elements 100000 times:

#include <time.h>
#include <iostream>

int main()
{
  clock_t time;
  int i, y, x;

  int matrix[9][9]{{ 0,  1,  2,  3,  4,  5,  6,  7,  8},
                   { 9, 10, 11, 12, 13, 14, 15, 16, 17},
                   {18, 19, 20, 21, 22, 23, 24, 25, 26},
                   {27, 28, 29, 30, 31, 32, 33, 34, 35},
                   {36, 37, 38, 39, 40, 41, 42, 43, 44},
                   {45, 46, 47, 48, 49, 50, 51, 52, 53},
                   {54, 55, 56, 57, 58, 59, 60, 61, 62},
                   {63, 64, 65, 66, 67, 68, 69, 70, 71},
                   {72, 73, 74, 75, 76, 77, 78, 79, 80}};

  time = clock();

  for (i = 0; i < 100000; i++)
  {
    for (x = 0; x < 9; x++)
    {
      for (y = 0; y < 9; y++)
      {
        matrix[x][y];
      }
    }
  }

  time = clock() - time;
  std::cout << "Clicks:     " << time << std::endl;
  std::cout << "Time taken: " << (double) time / CLOCKS_PER_SEC << "s" << std::endl;
}

Program 2 creates a 1D array and also measures how long it takes to access all of its elements 100000 times:

#include <time.h>
#include <iostream>

int main()
{
  clock_t time;
  int i, j;

  int vector[81] = { 0,  1,  2,  3,  4,  5,  6,  7,  8,
                     9, 10, 11, 12, 13, 14, 15, 16, 17,
                    18, 19, 20, 21, 22, 23, 24, 25, 26,
                    27, 28, 29, 30, 31, 32, 33, 34, 35,
                    36, 37, 38, 39, 40, 41, 42, 43, 44,
                    45, 46, 47, 48, 49, 50, 51, 52, 53,
                    54, 55, 56, 57, 58, 59, 60, 61, 62,
                    63, 64, 65, 66, 67, 68, 69, 70, 71,
                    72, 73, 74, 75, 76, 77, 78, 79, 80};

  time = clock();

  for (i = 0; i < 100000; i++)
  {
    for (j = 0; j < 81; j++)
    {
      vector[j];
    }
  }

  time = clock() - time;
  std::cout << "Clicks:     " << time << std::endl;
  std::cout << "Time taken: " << (double) time / CLOCKS_PER_SEC << "s" << std::endl;
}

After executing program 1 my output is:

Clicks:     8106
Time taken: 0.008106s

After executing program 2 my output is:

Clicks:     15958
Time taken: 0.015958s

It is my understanding that a 1D array is stored in a continuous block of memory. Likewise the rows of a static 2D array are stored in contiguous blocks of memory. Conversely the rows of a dynamic 2d array might not be stored in contiguous blocks of memory. If this is true then program 2 should be at least similar in speed to program 1 therefore my question is why would program 1 be remarkably faster than program 2?

Correct me if I'm wrong, but doesn't c treat 2d and 1d arrays pretty much the same? — KyleKW
– KyleKW, Commented Nov 22, 2016 at 17:17
@KyleKW I believe so but the indexing would be different i.e in the 2D array you access each element with an index = width * column index + row index. My concern here is why would the 2D array be faster to access than the 1D array. — JeliBeanMachine
– JeliBeanMachine, Commented Nov 22, 2016 at 17:23
Following what I was saying, you may want to look at this answer (2nd and 3rd paragraph are talking about caching and prefetching ) — Wexiwa
– Wexiwa, Commented Nov 22, 2016 at 17:44
Trying to measure performance of such short period will often be biased due to noise caused by other processes - scheduler manages context switches on a level of 10-20 ms so you need your test to be atleast a magnitude slower for it to have no effect. After increasing number of iterations to 10000000 I got reverse results: 4.69s for 2D array and 3.44s for 1D array, in both cases error in measurement is less than 1%. — thorhunter
– thorhunter, Commented Nov 22, 2016 at 18:23

Enigma · Accepted Answer · 2016-11-22 18:48:05Z

2

Here is what I found:

If you actually make use of the value then the run time is almost the same, for example, change matrix[x][y]; to matrix[x][y] += 1; and vector[j]; to vector[j] += 1;
```
>     Clicks:     28519
>     Time taken: 0.028519s
```
and
```
>     Clicks:     29941
>     Time taken: 0.029941s
```
Without the above changes, optimize while compiling, g++ -O3 <filename>.cpp, this results in same time, got the same following output for both programs:

$./a.out
```
>     Clicks:     2
>     Time taken: 2e-06s
```

So, what you are pointing out is compiler optimizations.

edited Nov 22, 2016 at 18:48

answered Nov 22, 2016 at 18:21

Enigma

3291 silver badge10 bronze badges

Sign up to request clarification or add additional context in comments.

2 Comments

Michael Dorgan Over a year ago

Not sure why the downvote for you, but I countered it for now. I would lead off your answer with your conclusion though, because these questions come up so often and 99% of the time, it is because the are coming to conclusions on way to limited a set of data.

Enigma Over a year ago

@michaelDorgan Thank you :)

felix · Accepted Answer · 2016-11-22 19:24:04Z

The loops are likely to be removed (kind of optimising) by compiler, because

You actually did nothing in loops.
The matrix can be treated as a const array.
program 1 is faster than program 2. ( :< )

To see whether the deletion happens to your code during compiling, you can increase the most outer loop by 100 times, and see whether the time needed for execution is increased significantly (not necessarily by exact 100 times).

If true, you can prevent this kind of optimising by doing some actual works in loop (calculate the sum, and don't forget printing it afterwards) and introduce some "unpredictable" changes to your matrix, for example:

srand(10);
for (int i=0; i<9; ++i) {
  matrix[i][i] = rand()%100;
}

And further, compiler may conduct some other optimising to your code, for example, expand your loops, even the address of the element you are visiting (they are no longer calculated at run time), you can prevent this by making the executing times of loops "unpredictable" :

#include <chrono>
#include <iostream>
#include <cstdlib>

int array[100];
int array2[10][10];

int64_t Sum1D(int len) {
  int64_t sum = 0;
  for (int i=0; i<100000; ++i) {
    for (int j=0; j<len; ++j) {
        sum += array[j];
    }
  }
  return sum;
}

int64_t Sum2D(int len1, int len2) {
  int64_t sum = 0;
  for (int i=0; i<100000; ++i) {
    for (int j=0; j<len1; ++j) {
      for (int k=0; k<len2; ++k)
        sum += array2[j][k];
    }
  }
  return sum;
}

int main()
{
  for (int i=0; i<100; ++i) {
    array[i] = rand();
    array2[i%10][i/10] = rand();
  }

  auto time = std::chrono::steady_clock::now();

  //int64_t sum = Sum1D(100);
  int64_t sum = Sum2D(10,10);

  auto duration = std::chrono::steady_clock::now()-time;
  std::cout << sum << "!" << duration.count() << std::endl;

  return 0;
}

which finally makes program1 slower than program2. ( :> )

Collectives™ on Stack Overflow

Why would 2D array access be faster than 1D array access?

2 Answers 2

2 Comments

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

2 Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related