1

The code look like this sum += array[j] + array[j+1] + array [j + 2]+ ... array[j + n]; how do I replace the j+n inside the bracket to improve the timing?

1
  • 2
    Is this a theoretical EECS class exercise, or a real-world question? Are you measuring timing as 'unoptimized instruction count'? Are we supposed to assume any particular instruction set (MIPS, x86, ?) Commented Nov 24, 2010 at 0:33

3 Answers 3

7

You don't do this. Unless you have a brain-dead compiler, it should be able to optimise this quite adequately.

If you're going to do this level of micro-optimisation, you need to start looking at the underlying assembly code, not assuming that your compiler will blindly translate your code into exactly-equivalent assembly code.

You will also need to understand the intricacies of your target platform better than the people that write your compiler which is frankly, based on the insane code I've seen coming from gcc in high optimisation levels, unlikely :-)

You'll usually get a better return on investment if you concentrate on big-picture optimisations like algorithm selection and so forth.

What you should do (if you haven't already) is profile the code, once finished, to see where the bottlenecks are (and only if it's underperforming: there's no point in optimising something that's already running fast enough).

Then concentrate on those bottlenecks. Measure, don't guess!


By way of example, I was actually going to show you how well optimised the following code became:

#include <stdio.h>
int main(void) {
    int j, sum, array[50];
    for (j = 0; j < 50; j++)
        array[j] = 999 - j * 2;
    j = 22;
    sum = array[j] + array[j+1] + array [j + 2] + array[j + 3];
    return 0;
}

but under gcc optimisation level 3, it became:

main:
    pushl   %ebp         ; prolog
    xorl    %eax, %eax   ; return value
    movl    %esp, %ebp   ; epilog 1
    popl    %ebp         ; epilog 2
    ret

Yes, that's right, it's just the stack prolog and epilog code and setting of the return value, with no calculations in sight. gcc has (rightly) figured out that none of the calculations are used anywhere so has optimised them totally out of existence.


Once you use it, the relevant code becomes a simple:

movl    116(%esp), %eax
addl    112(%esp), %eax
addl    120(%esp), %eax
addl    124(%esp), %eax

and you'd be hard-pressed getting it much more optimised than that.

Sign up to request clarification or add additional context in comments.

4 Comments

The chalenge is no optimization is allowed
Wooo, dead code removal. The j+1...j+n stuff could (should?) be in a loop too.
@vincent: Add some context to your question and you won't get useless answers.
@vincent, if you're not allowed optimisation, then it's probably a homework question and you should have stated so (or at least included that limitation in the question). In the real world, optimisation is allowed. In any case, the crux of my answer stands: examine the assembly output and test different scenarios. You cannot make a blanket assertion that a piece of C code will run faster in all environments.
3

Why not

T* aj = &(array[j]);
sum = aj[0] + aj[1] + ...

?

3 Comments

This just makes the code look different. It won't affect the timing.
@Graeme: it really depends on optimizer. Some embedded compilers are not particularly good in optimizations, so this actually may bring some performance gain.
fair enough. I take back my -1.
0
int *aj

aj = &(array[j]);

is correct, and the sum is equal to

sum += *aj + *(aj+1) + *(aj+2) ....

It defenitly would perform your code if you start whit

aj = &(array[0]);

and then just make the adds to aj

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.