Does Python automatically optimize/cache function calls?

Question

I'm relatively new to Python, and I keep seeing examples like:

def max_wordnum(texts):
    count = 0
    for text in texts:
        if len(text.split()) > count:
            count = len(text.split())
    return count

Is the repeated len(text.split()) somehow optimized away by the interpreter/compiler in Python, or will this just take twice the CPU cycles of storing len(text.split()) in a variable?

No, it's not. The call will be done twice, so there's room for optimizing the code.
– Klaus D., Commented Aug 11, 2018 at 19:35
A good example would write the function like this: max(len(text.split()) for text in texts)
– Daniel, Commented Aug 11, 2018 at 20:00

user2864740 · Accepted Answer · 2018-08-15 00:00:46Z

Duplicate expressions are not "somehow optimized away". Use a local variable to capture and reuse a result that is known not to change and takes a non-trivial amount of time to compute, or where a variable makes the code clearer.
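A minimal sketch of that advice applied to the function from the question. The variable name word_count is only an illustrative choice:

```python
def max_wordnum(texts):
    """Return the largest word count among the given strings."""
    count = 0
    for text in texts:
        # Split once and reuse the result instead of calling split() twice.
        word_count = len(text.split())
        if word_count > count:
            count = word_count
    return count
```

The behavior is unchanged; only the second split() per iteration is gone.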

In this case, Python cannot know that text.split() is pure. A pure function is one that has no side effects and always returns the same value for the same input.
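To make the purity point concrete, here is a small sketch using a hypothetical str subclass, CountingText, whose split() has a side effect (it counts its own calls). The class is invented for illustration; the function is the one from the question, unchanged:

```python
class CountingText(str):
    """Hypothetical str subclass whose split() records how often it runs."""
    calls = 0

    def split(self, *args, **kwargs):
        CountingText.calls += 1  # a side effect the interpreter must preserve
        return super().split(*args, **kwargs)


def max_wordnum(texts):
    count = 0
    for text in texts:
        if len(text.split()) > count:
            count = len(text.split())
    return count


result = max_wordnum([CountingText("one two three")])
print(result, CountingText.calls)  # 3 2: split() really ran twice
```

If the interpreter silently merged the two calls, the observable call count would change, which is why it cannot do so in general.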

More fundamentally, Python is dynamically typed and doesn't know the type of text until it actually holds a value, so a general optimization of this kind is not possible. (Some classes may provide their own internal caching, but that's a digression.)
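Caching in Python is something you opt into explicitly when you know a function is pure. One standard-library way to do that is functools.lru_cache; the sketch below assumes a helper named word_count and a call_log list, both made up for this illustration:

```python
from functools import lru_cache

call_log = []  # records each real computation, for illustration only


@lru_cache(maxsize=None)
def word_count(text):
    """Count whitespace-separated words; results are cached per string."""
    call_log.append(text)
    return len(text.split())


word_count("a b c")
word_count("a b c")  # served from the cache, the body does not run again
print(len(call_log))  # 1
```

For a cheap operation like split() on short strings, a plain local variable is likely the simpler choice; the cache mainly pays off for expensive, repeated computations.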

Even a statically typed language like C# won't, and can't, optimize away general method calls, because there is no enforceable guarantee of purity in C# either. (What if the method returned a different value on the second call, or wrote to the console?)

By contrast, a purely functional language such as Haskell has the option of not evaluating the call twice, because it is a different language with different rules.

Jean-François Fabre · Accepted Answer · 2018-08-11 20:04:25Z

Even if Python did optimize this (which isn't the case), the expression is copy-pasted and harder to maintain, so storing the result of a complex computation in a variable is always a good idea.

A better idea still is to use max with a generator expression in this case:

return max(len(text.split()) for text in texts)

This is also faster.
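One difference from the original loop is worth keeping in mind: max() raises ValueError on an empty iterable, whereas the loop returns 0. A sketch of the full function that preserves the original behavior, using the default argument of max (available since Python 3.4):

```python
def max_wordnum(texts):
    # default=0 mirrors the original loop, which returns 0 for empty input;
    # without it, max() raises ValueError on an empty iterable.
    return max((len(text.split()) for text in texts), default=0)
```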

Also note that len(text.split()) builds a list only to count its items. A cheaper way is to count the spaces (valid only if words are separated by exactly one space and there is no leading or trailing space):

return max(text.count(" ") for text in texts) + 1

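If there can be more than one space between words, use a regex with re.finditer to avoid creating lists (this needs import re, and a raw string so that \s is not treated as a string escape):

return max(sum(1 for _ in re.finditer(r"\s+", text)) for text in texts) + 1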

Note the 1 added at the end to correct the value: the number of separators is one less than the number of words.
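The counting variants do not always agree, so it helps to compare them on a few inputs. In the sketch below all function names are illustrative, and words_by_word_regex is a swapped-in variant (counting runs of non-whitespace with \S+) rather than something from the answer:

```python
import re


def words_by_split(text):
    return len(text.split())


def words_by_space_count(text):
    return text.count(" ") + 1


def words_by_separator_regex(text):
    return sum(1 for _ in re.finditer(r"\s+", text)) + 1


def words_by_word_regex(text):
    # Variant: count the words themselves, so no "+ 1" correction is needed.
    return sum(1 for _ in re.finditer(r"\S+", text))


counters = (words_by_split, words_by_space_count,
            words_by_separator_regex, words_by_word_regex)

for sample in ["one two three", "one  two", " padded "]:
    print(repr(sample), [count(sample) for count in counters])
```

All four agree on "one two three". With a doubled space the space-count version overcounts, and with leading or trailing whitespace both separator-based versions overcount, while split() and the \S+ variant still match.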

As an aside, even though the value isn't cached, you can still use complex expressions in loops with range:

for i in range(len(text.split())):

The range object is created once at the start, so the expression is evaluated only once (as opposed to the condition of a C for loop, for instance).
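A small sketch that makes this visible by counting evaluations. The helper length_of and the counter dictionary are hypothetical names used only here; the while loop stands in for a C-style loop whose condition is re-checked on every pass:

```python
counter = {"n": 0}


def length_of(words):
    """Return len(words) and record that the expression was evaluated."""
    counter["n"] += 1
    return len(words)


words = "one two three".split()

# for + range: the argument to range() is evaluated once, up front.
for i in range(length_of(words)):
    pass
for_evaluations = counter["n"]

# while: the condition is re-evaluated before every iteration.
counter["n"] = 0
i = 0
while i < length_of(words):
    i += 1
while_evaluations = counter["n"]

print(for_evaluations, while_evaluations)  # 1 4
```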
