4

Why is the nested for loop faster than the single for loop?

start = time()

k = 0
m = 0

for i in range(1000):
    for j in range(1000):
        for l in range(100):
            m+=1

#for i in range(100000000):
#    k +=1

print int(time() - start)

For the single for loop I get a time of 14 seconds and for the nested for loop of 10 seconds

2
  • 1
    are you using Python2? probably yes. Commented Nov 12, 2018 at 15:38
  • 1
    For context, read this. Commented Nov 12, 2018 at 15:41

4 Answers 4

1

The relevant context is explained in this topic.

In short, range(100000000) builds a huge list in Python 2, whereas with the nested loops you only build lists with a total of 1000 + 1000 + 100 = 2100 elements. In Python 3, range is smarter and lazy like xrange in Python 2.

Here are some timings for the following code. Absolute runtime depends on the system, but comparing the values with each other is valuable.

import timeit

runs = 100

code = '''k = 0
for i in range(1000):
    for j in range(1000):
        for l in range(100):
            k += 1'''

print(timeit.timeit(stmt=code, number=runs))

code = '''k = 0
for i in range(100000000):
    k += 1'''

print(timeit.timeit(stmt=code, number=runs))

Outputs:

CPython 2.7 - range

264.650791883
372.886064053

Interpretation: building huge lists takes time.

CPython 2.7 - range exchanged with xrange

231.975350142
221.832423925

Interpretation: almost equal, as expected. (Nested for loops should have slightly larger overhead than a single for loop.)

CPython 3.6 - range

365.20924194483086
437.26447860104963

Interpretation: Interesting! I did not expect this. Anyone?

Sign up to request clarification or add additional context in comments.

Comments

1

It is because you are using Python2. Range generates a list of numbers, and has to allocate that list. In the first nested loop you are allocating 1000 + 1000 + 100, so the list size is 2100, while in the other one the list has a size of 100000000, which is much bigger.

In python2 is better to use a generator, xrange(), a generator yields the numbers instead of building and allocating a list with them.

Aditionally and for further information you can read this question that it is related to this but in python3

Comments

1

In Python 2, range creates a list with all of the numbers within the list. Try swapping range with xrange and you should see them take comparable time or the single loop approach may work a bit faster.

Comments

1

during the nested loops python has to allocate 1000+1000+100=2100 values for the counters whereas in the single loop it has to allocate 10M. This is what's taking the extra time

i have tested this in python 3.6 and the behaviour is similar, i would say it's very likely this is a memory allocation issue.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.