
I read List comprehension without [ ] in Python, so now I know that

''.join([str(x) for x in mylist])

is faster than

''.join(str(x) for x in mylist)

because "list comprehensions are highly optimized"

So I supposed the optimization works like this: the parser sees the for expression, sees mylist, computes its length, and uses it to pre-allocate an array of exactly that size, which saves a lot of reallocation.

When using ''.join(str(x) for x in mylist), join receives a generator blindly and has to build its list without knowing the size in advance.
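This part of the premise is easy to check: a list reports its length, while a generator offers no size information at all, so anything consuming it can only discover the size by exhausting it. A small probe using `operator.length_hint`, which falls back to a default when the iterable provides no hint:

```python
from operator import length_hint

mylist = [1, 2, 5, 6, 3, 4, 5]

listcomp = [str(x) for x in mylist]   # a real list: its length is known
genexp = (str(x) for x in mylist)     # a generator: no length, no hint

print(len(listcomp))           # 7
print(length_hint(genexp, 0))  # 0 -- generators give no size estimate
```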

But now consider this:

mylist = [1,2,5,6,3,4,5]
''.join([str(x) for x in mylist if x < 4])

How does Python decide the size of the list comprehension? Is it computed from the size of mylist and downsized once the iteration is done (which could be very bad if the list is big and the condition filters out 99% of the elements), or does it revert to the "don't know the size in advance" case?

EDIT: I've done some small benchmarks and it seems to confirm that there's an optimization:

without a condition:

import timeit

print(timeit.timeit("''.join([str(x) for x in [1,5,6,3,5,23,334,23234]])"))
print(timeit.timeit("''.join(str(x) for x in [1,5,6,3,5,23,334,23234])"))

yields (as expected):

3.11010817019474
3.3457350077491026

with a condition:

print(timeit.timeit("''.join([str(x) for x in [1,5,6,3,5,23,334,23234] if x < 50])"))
print(timeit.timeit("''.join(str(x) for x in [1,5,6,3,5,23,334,23234] if x < 50)"))

yields:

2.7942209702566965
3.0316467566203276

so the conditional list comprehension is still faster.

  • Does this answer your question: List comprehension vs generator expression's weird timeit results? Commented Jan 7, 2017 at 9:16
  • Not bad, but in that question there is never a condition after the for loop, only in the expression itself, which means the size is known in advance. Commented Jan 7, 2017 at 9:20
  • I think the way Python deals with for x in xx for y in yy in the linked question and for x in y if x<123 in your question should be similar, as in both cases Python does not know the size of the resulting list until the expression is evaluated. (Just a logical assumption; not sure if it is true.) Commented Jan 7, 2017 at 9:24
  • I suggest you look at the source code: github.com/python/cpython/blob/master/Objects/stringlib/join.h Commented Jan 7, 2017 at 9:27
  • @Jean-FrançoisFabre I've understood your question. If you really delve into the source you'll realize that you need to seek PyBytes_GET_SIZE, and then you might find your answer. Commented Jan 7, 2017 at 9:32

1 Answer


List comprehensions don't pre-size the list, even when they totally could. You're assuming the presence of an optimization that isn't actually done.
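One way to see this (a CPython-specific sketch) is to disassemble a comprehension: the bytecode builds an empty list and grows it one LIST_APPEND at a time, with no instruction that consults the source's length:

```python
import dis

code = compile("[str(x) for x in mylist]", "<demo>", "eval")

def all_opnames(co):
    """Collect opcode names, descending into nested code objects
    (comprehensions compile to a nested code object before Python 3.12)."""
    names = [ins.opname for ins in dis.get_instructions(co)]
    for const in co.co_consts:
        if hasattr(const, "co_code"):
            names += all_opnames(const)
    return names

ops = all_opnames(code)
print("BUILD_LIST" in ops)    # True: starts from an empty list
print("LIST_APPEND" in ops)   # True: elements are appended one by one
```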

The list comprehension is faster because the iterator machinery and the work of repeatedly entering and exiting the genexp's stack frame both have a cost, and the list comprehension doesn't need to pay it.
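That overhead can be observed with a micro-benchmark of the list-building step alone (absolute timings are machine-dependent; only the ordering matters):

```python
import timeit

setup = "data = list(range(1000))"

# build the same list both ways: directly, and through a generator
lc = timeit.timeit("[str(x) for x in data]", setup, number=2000)
ge = timeit.timeit("list(str(x) for x in data)", setup, number=2000)

print(f"list comprehension:   {lc:.3f}s")
print(f"generator expression: {ge:.3f}s")
print(lc < ge)  # the comprehension avoids the per-item frame switches
```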


2 Comments

Could there be a substantial gain, then, in preallocating the list? As @Jean-François noted, this "could be bad if the list is big and the condition filters out 99% of the elements", but then again, it could be good if the condition filtered out only 1% of the elements!
@RadLexus exactly: if the collection iterated over is a list or something with a known-in-advance size, with no condition and no double "for" loop, it could be a (very common) special case in which the list size is pre-allocated. That would be applicable 80% of the time in existing code!
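The gain being speculated about in these comments can be sketched with a hypothetical micro-benchmark (the function names and sizes here are made up for illustration): filling a pre-allocated list versus growing one append by append:

```python
import timeit

N = 10_000

def grown():
    out = []
    append = out.append        # hoist the attribute lookup
    for i in range(N):
        append(i)
    return out

def preallocated():
    out = [None] * N           # size known in advance: no reallocation
    for i in range(N):
        out[i] = i
    return out

assert grown() == preallocated()
print(timeit.timeit(grown, number=500))
print(timeit.timeit(preallocated, number=500))
```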
