2

#include <iostream>
#include <ctime>

#define TIME(t) {std::cout << ((double)(clock() - (t)) / CLOCKS_PER_SEC) << '\n';}

volatile long int limit = 10000000000;

void l2(int& a) {a++;}

void f(int& a)
{
  auto l1 = [&a]()
  {
    a++;
  };

  clock_t clk = clock();
  for(int i=0;i<limit;i++)
  {
    l1();
  }
  TIME(clk) // 4.07 s

  a=5;
  clk = clock();
  for(int i=0;i<limit;i++)
  {
    l2(a);
  }
  TIME(clk) // 4.32 s
}

int main()
{
  int a = 5;
  f(a);
  return 0;
}

Why is calling a lambda function faster?

I'm using gcc 4.8 with -O3.

  • They're not equivalent. One of them passes a parameter, the other doesn't. Commented Oct 7, 2015 at 19:53
  • Both require a reference to 'a' and both increment it. What is the inherent difference? Commented Oct 7, 2015 at 19:55
  • One is a parameter that is passed on every call, the other is a field that is only constructed once..? Commented Oct 7, 2015 at 19:56

2 Answers

8

Lambda loop disassembly: (using godbolt gcc 4.8.2 -O3 in C++11 mode)

movq    limit(%rip), %rax
testq   %rax, %rax
jle .L7
movl    (%rbx), %eax
movl    $1, %edx
.L8:
movq    limit(%rip), %rcx
movq    %rdx, %rsi
leal    (%rax,%rdx), %edi
addq    $1, %rdx
cmpq    %rcx, %rsi
jl  .L8
movl    %edi, (%rbx)

Function call loop disassembly:

movq    limit(%rip), %rax
testq   %rax, %rax
jle .L5
movl    (%rbx), %eax
movl    $1, %edx
.L10:
movq    limit(%rip), %rcx
movq    %rdx, %rsi
leal    (%rax,%rdx), %edi
addq    $1, %rdx
cmpq    %rcx, %rsi
jl  .L10
movl    %edi, (%rbx)

The two loops compile down to identical code.
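
To make the listings easier to read, here is a rough hand translation of the loop back into C++. It is a sketch of my reading of the assembly, not compiler output: the names naive and as_compiled are made up for illustration, and limit is shrunk to a small assumed value so the two versions can be compared quickly.

```cpp
// Small stand-in for the question's limit (assumed value, kept volatile).
volatile long limit = 1000;

// What the source says: bump a once per iteration.
void naive(int& a) {
    for (long i = 0; i < limit; i++) a++;
}

// Roughly what either listing does: load a once, form start + n on each
// pass, and write the result back to memory only after the loop ends.
void as_compiled(int& a) {
    if (limit <= 0) return;                    // testq / jle: skip an empty loop
    const int start = a;                       // movl (%rbx), %eax
    long n = 1;                                // movl $1, %edx
    long prev = 0;
    long lim = 0;
    int result = start;
    do {
        lim = limit;                           // movq limit(%rip), %rcx (volatile reload)
        prev = n;                              // movq %rdx, %rsi
        result = start + static_cast<int>(n);  // leal (%rax,%rdx), %edi
        n += 1;                                // addq $1, %rdx
    } while (prev < lim);                      // cmpq %rcx, %rsi / jl
    a = result;                                // movl %edi, (%rbx)
}
```

Neither listing contains a call instruction or a reload of a inside the loop, so both the lambda and l2 have been inlined and the reference has been hoisted out of the loop in both cases.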

Any timing difference is due to the order in which you ran the two loops, or to random noise.
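
One way to check for an ordering effect is to run both loops several times and swap which one goes first on each round. The following is only a minimal sketch of that idea: seconds, compare, bump, Timings and the iterations count are all names and values I made up, and it uses std::chrono::steady_clock rather than the question's clock().

```cpp
#include <chrono>

// Assumed iteration count, far smaller than the question's limit.
volatile long iterations = 1000000;

void bump(int& a) { a++; }

// Run body once and return the elapsed wall-clock time in seconds.
template <class F>
double seconds(F&& body) {
    auto start = std::chrono::steady_clock::now();
    body();
    std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

struct Timings {
    double lambda_s = 0;    // total time spent in the lambda loop
    double function_s = 0;  // total time spent in the function loop
    int value = 0;          // final counter, returned so the work is observable
};

// Alternate which loop runs first so that warm-up effects hit both equally.
Timings compare(int rounds) {
    Timings t;
    int a = 0;
    auto inc = [&a]() { a++; };
    auto run_lambda = [&] { for (long i = 0; i < iterations; i++) inc(); };
    auto run_function = [&] { for (long i = 0; i < iterations; i++) bump(a); };
    for (int r = 0; r < rounds; r++) {
        if (r % 2 == 0) {
            t.lambda_s += seconds(run_lambda);
            t.function_s += seconds(run_function);
        } else {
            t.function_s += seconds(run_function);
            t.lambda_s += seconds(run_lambda);
        }
    }
    t.value = a;
    return t;
}
```

If the two totals from something like compare(10) stay close to each other, and the gap changes sign between runs, that points to noise rather than a real difference between the lambda and the function.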

In general, lambdas are easier to inline, because what operator() does is determined by the type of the variable, not by its value. With a function pointer, the compiler first has to work out which function the pointer holds, and propagating values and using them to optimize is a touch harder than doing the same with types.

The classic example is using qsort vs std::sort.
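
A minimal sketch of that comparison, with helper names (compare_ints, sort_with_qsort, sort_with_std_sort) that I made up for illustration:

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

// qsort takes the comparison as a function pointer, which is a runtime
// value: to inline it the optimizer must first prove what it points to.
int compare_ints(const void* lhs, const void* rhs) {
    const int x = *static_cast<const int*>(lhs);
    const int y = *static_cast<const int*>(rhs);
    return (x > y) - (x < y);  // negative, zero or positive, without overflow
}

void sort_with_qsort(std::vector<int>& v) {
    if (v.empty()) return;  // avoid handing qsort a possibly null pointer
    std::qsort(v.data(), v.size(), sizeof(int), compare_ints);
}

// std::sort is a template, so the lambda's closure type is part of the
// instantiation and the call to operator() is resolved at compile time.
void sort_with_std_sort(std::vector<int>& v) {
    std::sort(v.begin(), v.end(), [](int x, int y) { return x < y; });
}
```

Both functions sort the vector in ascending order. The difference is only in what the optimizer gets to see at the call site, so whether it shows up in timings depends on the compiler and flags.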


1

Conceptually, the function can work on other variables, but the lambda only works on a.
Clearly, the more flexibility you have, the more you can expect to pay for it, as in this case.
Here, what is happening is that passing a parameter to the function on every call is more expensive than not doing so, and the compiler hasn't been able to optimize this away, hence the difference.

2 Comments

I'd be surprised if both lambda and function calls weren't inlined, and I doubt the compiler will have any trouble optimizing this simple code. Since @Kam only says one is faster than the other, we don't know how big the difference is. Who knows, it might be insignificant and thus meaningless...
@eran: It doesn't have much to do with inlining (I assumed both are inlined), but rather with loop invariant code motion. The point is that the same variable is being passed to the function multiple times in one case, but not the other. The compiler has to be pretty smart in order to move out the loop invariant, and while it's not impossible by any means, it's not surprising to me that some compilers might not do this.
