4

Which of these would be more computationally efficient, and why?

A) Repeated array access:

for(i=0; i<numbers.length; i++) {
    result[i] = numbers[i] * numbers[i] * numbers[i];
}

B) Setting a local variable:

for(i=0; i<numbers.length; i++) {
    int n = numbers[i];
    result[i] = n * n * n;
}

Would not the repeated array access version have to be calculated (using pointer arithmetic), making the first option slower because it is doing this?:

for(i=0; i<numbers.length; i++) {
    result[i] = *(numbers + i) * *(numbers + i) * *(numbers + i);
}
12
  • You could run the code using a Timer object? Would be interesting to see what the results are. I mention this as you already have it coded :) Commented Oct 8, 2013 at 9:43
  • I suppose that the compiler will generate the same instructions. Commented Oct 8, 2013 at 9:50
  • @Michael You're saying that as if every compiler on every platform behaved the same. Commented Oct 8, 2013 at 9:52
  • 2
    As with all benchmark questions, the only way to reliable determine the answer is to benchmark the performance in your target environment. This greatly depends on how well you compiler optimizes, whether you target architecture has special instructions to support such operations and probably other factors. Commented Oct 8, 2013 at 9:55
  • 2
    Accessing numbers.length but also using the subscription operator makes me think that the c tag is actually inappropriate. Commented Oct 8, 2013 at 9:55

3 Answers 3

15

Any sufficiently sophisticated compiler will generate the same code for all three solutions. I turned your three versions into a small C program (with a minor adjustement, I changed the access numbers.length to a macro invocation which gives the length of an array):

#include <stddef.h>

size_t i;
static const int numbers[] = { 0, 1, 2, 4, 5, 6, 7, 8, 9 };

#define ARRAYLEN(x) (sizeof((x)) / sizeof(*(x)))
static int result[ARRAYLEN(numbers)];

void versionA(void)
{
    for(i=0; i<ARRAYLEN(numbers); i++) {
        result[i] = numbers[i] * numbers[i] * numbers[i];
    }
}

void versionB(void)
{
    for(i=0; i<ARRAYLEN(numbers); i++) {
        int n = numbers[i];
        result[i] = n * n * n;
    }
}

void versionC(void)
{
    for(i=0; i<ARRAYLEN(numbers); i++) {
            result[i] = *(numbers + i) * *(numbers + i) * *(numbers + i);
    }
}

I then compiled it using optimizations (and debug symbols, for prettier disassembly) with Visual Studio 2012:

C:\Temp>cl /Zi /O2 /Wall /c so19244189.c
Microsoft (R) C/C++ Optimizing Compiler Version 17.00.50727.1 for x86
Copyright (C) Microsoft Corporation.  All rights reserved.

so19244189.c

Finally, here's the disassembly:

C:\Temp>dumpbin /disasm so19244189.obj

[..]

_versionA:
  00000000: 33 C0              xor         eax,eax
  00000002: 8B 0C 85 00 00 00  mov         ecx,dword ptr _numbers[eax*4]
            00
  00000009: 8B D1              mov         edx,ecx
  0000000B: 0F AF D1           imul        edx,ecx
  0000000E: 0F AF D1           imul        edx,ecx
  00000011: 89 14 85 00 00 00  mov         dword ptr _result[eax*4],edx
            00
  00000018: 40                 inc         eax
  00000019: 83 F8 09           cmp         eax,9
  0000001C: 72 E4              jb          00000002
  0000001E: A3 00 00 00 00     mov         dword ptr [_i],eax
  00000023: C3                 ret

_versionB:
  00000000: 33 C0              xor         eax,eax
  00000002: 8B 0C 85 00 00 00  mov         ecx,dword ptr _numbers[eax*4]
            00
  00000009: 8B D1              mov         edx,ecx
  0000000B: 0F AF D1           imul        edx,ecx
  0000000E: 0F AF D1           imul        edx,ecx
  00000011: 89 14 85 00 00 00  mov         dword ptr _result[eax*4],edx
            00
  00000018: 40                 inc         eax
  00000019: 83 F8 09           cmp         eax,9
  0000001C: 72 E4              jb          00000002
  0000001E: A3 00 00 00 00     mov         dword ptr [_i],eax
  00000023: C3                 ret

_versionC:
  00000000: 33 C0              xor         eax,eax
  00000002: 8B 0C 85 00 00 00  mov         ecx,dword ptr _numbers[eax*4]
            00
  00000009: 8B D1              mov         edx,ecx
  0000000B: 0F AF D1           imul        edx,ecx
  0000000E: 0F AF D1           imul        edx,ecx
  00000011: 89 14 85 00 00 00  mov         dword ptr _result[eax*4],edx
            00
  00000018: 40                 inc         eax
  00000019: 83 F8 09           cmp         eax,9
  0000001C: 72 E4              jb          00000002
  0000001E: A3 00 00 00 00     mov         dword ptr [_i],eax
  00000023: C3                 ret

Note how the assembly is exactly the same in all cases. So the correct answer to your question

Which of these would be more computationally efficient, and why?

for this compiler is: mu. Your question cannot be answered because it's based on incorrect assumptions. None of the answers is faster than any other.

Sign up to request clarification or add additional context in comments.

1 Comment

Thanks for this.. nice to see the assembly
2

The theoretical answer:

A reasonably good optimizing compiler should convert version A to version B, and perform only one load from memory. There should be no performance difference if optimization is enabled.

If optimization is disabled, version A will be slower, because the address must be computed 3 times and there are 3 memory loads (2 of them are cached and very fast, but it's still slower than reusing a register).

In practice, the answer will depend on your compiler, and you should check this by benchmarking.

Comments

1

It depends on compiler but all of them should be the same.

First lets look at case B smart compiler will generate code to load value into register only once so it doesn't matter if you use some additional variable or not, compiler generates opcode for mov instruction and has value into register. So B is the same as A.

Now lets compare A and C. We should look at opeators [] inline implementation. a[b] actually is *(a + b) so *(numbers + i) the same as numbers[i] that means cases A and C are the same.

So we have (A==B) && (A==C) all in all (A==B==C) If you know what I mean :).

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.