Performance of Python list comprehension: rebuilding lambda in `if` part?

Question

While comparing various equivalent forms of filter(xs, lambda x: x != el) in Python, I stumbled upon something that surprised me. Consider the following forms:

def method1(xs, el):
    p = lambda x: x != el
    return [x for x in xs if p(x)]

def method2(xs, el):
    return [x for x in xs if (lambda y: y != el)(x)]

I would expect that Python'd build the lambda only once, and then store it in a temporary variable, so that both forms perform about as well. Maybe even that method1 would perform worse due to the name lookup.

But when I benchmarked them, it turned out that method2 performed consistently worse than method1. Why is this? Is it rebuilding the lambda for every iteration?

My benchmark script (in a separate module, and expects methods to contain method1 and method2) is as follows:

import math, timeit

def bench(n,rho,z):
    pre = """\
import random
from methods import %(method)s

x = [(random.randint(0,%(domain)i)) for r in xrange(%(size)i)]
el = x[0]\
"""

    def testMethod(m):
        mod = pre % { 'method': m, 'domain': int(math.ceil(n / rho)), 'size': n }
        return timeit.timeit("%s(x, el)" % m, mod, number = z)/(z * n)

    print "Testing", n, rho, z
    return tuple(testMethod(m) for m in ("method1", "method2"))

n = 31

min_size, max_size = 10.0**1, 10.0**4
size_base = math.pow(max_size / min_size, 1.0/(n-1))
# size_default = 10**3

#min_sel, max_sel = 0.001, 1.0
#sel_base = math.pow(max_sel / min_sel, 1.0/(n-1))
sel_default = 0.001

tests = [bench(int(min_size*size_base**x), sel_default, 100) for x in xrange(n)]
#tests = [bench(size_default, min_sel*sel_base**x, 100) for x in xrange(n)]

def median(x):
    x = list(sorted(x))
    mi = int(len(x)/2)
    if n % 2 == 0:
        return x[mi]
    else:
        return (x[mi] + x[mi+1])/2

def madAndMedian(x):
    meh = median(x)
    return meh, median([abs(xx - meh) for xx in x])

for z in zip(*tests):
    print madAndMedian(z)

Martijn Pieters · Accepted Answer · 2013-01-09 13:03:23Z

4

Yes, it is rebuilding the lambda on every loop; it needs to reevaluate that whole expression.

To see this, use the dis module:

>>> dis.dis(method1)
  2           0 LOAD_CLOSURE             0 (el)
              3 BUILD_TUPLE              1
              6 LOAD_CONST               1 (<code object <lambda> at 0x102000230, file "<stdin>", line 2>)
              9 MAKE_CLOSURE             0
             12 STORE_FAST               2 (p)

  3          15 BUILD_LIST               0
             18 LOAD_FAST                0 (xs)
             21 GET_ITER            
        >>   22 FOR_ITER                24 (to 49)
             25 STORE_FAST               3 (x)
             28 LOAD_FAST                2 (p)
             31 LOAD_FAST                3 (x)
             34 CALL_FUNCTION            1
             37 POP_JUMP_IF_FALSE       22
             40 LOAD_FAST                3 (x)
             43 LIST_APPEND              2
             46 JUMP_ABSOLUTE           22
        >>   49 RETURN_VALUE        
>>> dis.dis(method2)
  2           0 BUILD_LIST               0
              3 LOAD_FAST                0 (xs)
              6 GET_ITER            
        >>    7 FOR_ITER                33 (to 43)
             10 STORE_FAST               2 (x)
             13 LOAD_CLOSURE             0 (el)
             16 BUILD_TUPLE              1
             19 LOAD_CONST               1 (<code object <lambda> at 0x101fd37b0, file "<stdin>", line 2>)
             22 MAKE_CLOSURE             0
             25 LOAD_FAST                2 (x)
             28 CALL_FUNCTION            1
             31 POP_JUMP_IF_FALSE        7
             34 LOAD_FAST                2 (x)
             37 LIST_APPEND              2
             40 JUMP_ABSOLUTE            7
        >>   43 RETURN_VALUE

The LOAD_CONST opcode loads the compiled code for the lambda body; MAKE_CLOSURE creates the lambda from that. For method1 this happens once, while for method2 this is repeated in each iteration of the loop (from the FOR_ITER opcode to the JUMP_ABSOLUTE opcode); note the LOAD_FAST opcode for the variable p in method1 where it refers to the local variable instead.

edited Jan 9, 2013 at 13:03

answered Jan 9, 2013 at 12:52

Martijn Pieters

1.1m326 gold badges4.2k silver badges3.4k bronze badges

Sign up to request clarification or add additional context in comments.

5 Comments

user824425 Over a year ago

You're practically tied for time with @cmh, but thanks for the fast answer! :D Do you mind if I give the +15 to the one with least rep?

Martijn Pieters Over a year ago

@Tinctorius: It's your accept flag, give it to whomever you feel deserves it most! :-)

user824425 Over a year ago

And then you sneakily make your answer a little more precise. I'm onto you! ;)

Martijn Pieters Over a year ago

@Tinctorius: I always try to make my answers just that little better, as I think of more details to add. Sorry!

cmh Over a year ago

@Tinctorius, I would suggest accepting this answer and I'll delete mine. Having a duplicate (but slightly inferior) answer around isn't making the internet a better place.

user4815162342 · Accepted Answer · 2013-01-09 12:57:05Z

2

Python builds the code of the lambda only once. But it builds a new function object (that refers to the code and the enclosing environment that includes the el variable) in each pass of the loop. This would normally be a feature, as each function object could have different properties assigned to it via __setitem__.

In this case the function object is not stored anywhere nor leaked outside the scope, so it would be safe to move its the creation outside the loop, but Python's compiler is not yet smart enough to perform such an optimization.

answered Jan 9, 2013 at 12:57

user4815162342

159k22 gold badges350 silver badges418 bronze badges

5 Comments

Martijn Pieters Over a year ago

For a lambda it builds a closure, which is not quite the same as a function. :-) But yes, it only compiles once, but the closure building does take some time. As for 'smart enough'; the dynamic nature of the language makes it really hard to do such optimizations safely; PyPy does show that a lot more could be done than what CPython does now though.

user4815162342 Over a year ago

@MartijnPieters Unlike some other languages, Python doesn't really distinguish functions and closures on the data type level. Functions and closures created from Python are both represented by the function data type, whose func_closure attributes defines the cells captured by the closure (or None/NULL if nothing is closed over).

Martijn Pieters Over a year ago

The bytecode for functions and lambdas differs; functions are loaded with MAKE_FUNCTION, lambdas with LOAD_CLOSURE followed by MAKE_CLOSURE. That's all I wanted to point out. :-)

user4815162342 Over a year ago

@MartijnPieters Yes, the bytecode to create a function must include information on what part of the lexical environment to capture, if any. The resulting function objects are, however, of the exact same type. The distinction between MAKE_FUNCTION and MAKE_CLOSURE appears to be an artifact of history from the times when Python didn't have closures (and possibly a useful optimization for the simple function case).

Martijn Pieters Over a year ago

And yes, both those bytecodes (MAKE_CLOSURE and MAKE_FUNCION) result in a function() instance. It's just the way the bytecode is handled needs to differ, because a lambda only supports a simple expression.

Collectives™ on Stack Overflow

Performance of Python list comprehension: rebuilding lambda in `if` part?

2 Answers 2

5 Comments

5 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

5 Comments

5 Comments

Your Answer

Sign up or log in

Post as a guest

Related