I read "List comprehension without [ ] in Python", so now I know that
''.join([str(x) for x in mylist])
is faster than
''.join(str(x) for x in mylist)
because "list comprehensions are highly optimized".
So I suppose that the optimization works by parsing the for expression, seeing mylist, computing its length, and using it to pre-allocate the exact array size, which saves a lot of reallocation.
When using ''.join(str(x) for x in mylist), join receives a generator blindly and has to build its list without knowing the size in advance.
But now consider this:
mylist = [1,2,5,6,3,4,5]
''.join([str(x) for x in mylist if x < 4])
How does Python decide the size of the list built by the comprehension? Is it computed from the size of mylist and downsized once the iteration is done (which could be very bad if the list is big and the condition filters out 99% of the elements), or does it fall back to the "don't know the size in advance" case?
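One way to probe this empirically is to compare the memory footprint of a heavily filtered comprehension with that of an unfiltered one. This is only a sketch, assuming CPython, where sys.getsizeof reports the list object's allocated footprint; the names source, filtered and unfiltered are made up for illustration. If the result were pre-allocated from the length of the source list and never shrunk, the two sizes would be close.

```python
import sys

# Hypothetical test data: a large source list.
source = list(range(100_000))

# The condition keeps only 10 of the 100,000 elements.
filtered = [x for x in source if x < 10]
# No condition: every element is kept.
unfiltered = [x for x in source]

# Print element count and allocated size (in bytes) of each result.
print(len(filtered), sys.getsizeof(filtered))
print(len(unfiltered), sys.getsizeof(unfiltered))
```

Note that this check cannot, on its own, tell "grown incrementally" apart from "pre-allocated, then downsized at the end"; it only shows how much memory the finished list holds.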
EDIT: I've done some small benchmarks, and they seem to confirm that there's an optimization:
without a condition:
import timeit
print(timeit.timeit("''.join([str(x) for x in [1,5,6,3,5,23,334,23234]])"))
print(timeit.timeit("''.join(str(x) for x in [1,5,6,3,5,23,334,23234])"))
yields (as expected):
3.11010817019474
3.3457350077491026
with a condition:
print(timeit.timeit("''.join([str(x) for x in [1,5,6,3,5,23,334,23234] if x < 50])"))
print(timeit.timeit("''.join(str(x) for x in [1,5,6,3,5,23,334,23234] if x < 50)"))
yields:
2.7942209702566965
3.0316467566203276
So the conditional list comprehension is still faster.
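With only eight elements, these timings may be dominated by fixed per-call overhead rather than by list growth. A slightly more controlled variant, sketched here under the assumption that a 1000-element list built once in the timeit setup is representative (the list size, the repeat counts and the labels are arbitrary choices for illustration), runs all four statements against the same data:

```python
import timeit

# Build the input once, outside the timed statement.
setup = "mylist = list(range(1000))"

statements = {
    "listcomp": "''.join([str(x) for x in mylist])",
    "genexp": "''.join(str(x) for x in mylist)",
    "listcomp + if": "''.join([str(x) for x in mylist if x < 50])",
    "genexp + if": "''.join(str(x) for x in mylist if x < 50)",
}

results = {}
for label, stmt in statements.items():
    # Keep the minimum of several runs to reduce noise from other processes.
    results[label] = min(timeit.repeat(stmt, setup=setup, number=200, repeat=5))
    print(f"{label:15s} {results[label]:.4f} s")
```

The absolute numbers depend on the machine and the Python version, so only the relative ordering within one run is meaningful.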
for x in xx for y in yy in the linked question and for x in y if x<123 in your question should behave similarly, since in both cases Python does not know the size of the resulting list until the expression is evaluated. (This is just a logical assumption; I'm not sure it is true.)
PyBytes_GET_SIZE and then you might find your answer.