I am trying to understand the speed of sorting an already sorted list of objects in Python. Python uses Timsort algorithm which is said to perform O(n) on an already sorted list and O(nlogn) on average/worst case. In the code snippet below I perform two tests on a list of 20 million random numbers.
First test compares time needed for sorting an unsorted list of floats vs time needed for sorting the same list but already sorted. The time gain is roughly 7.5x (7.46 seconds vs 1.02 seconds) which makes sense.
In the second test, instead of using raw floats, I create a list of objects where each object holds one float number, then I perform the same sorting test. In that case sorting takes 10.08 seconds, then I sort again the already sorted list of objects, but time elapsed goes up from 10.08 seconds to 11.50 seconds which is counterintuitive to me. I understand that there might be some overhead from retrieving the object's attribute, however I would not expect the sorting of the already sorted list to take longer than sorting the same unsorted list.
My question is, why does it take longer to sort an already sorted list of objects than sorting the same unsorted list of objects?
import random
import time
class C:
def __init__(self, x):
self.x = x
random_nbrs = [random.random() for _ in range(20_000_000)]
t0 = time.time()
w = sorted(random_nbrs)
print("1. Sorting list of random numbers took {:.2f} seconds".format(time.time() - t0))
t0 = time.time()
x = sorted(w)
print("2. Sorting sorted list of random numbers took {:.2f} seconds".format(time.time() - t0))
objects = [C(e) for e in random_nbrs]
t0 = time.time()
y = sorted(objects, key=lambda x: x.x)
print("3. Sorting list of random objects took {:.2f} seconds".format(time.time() - t0))
t0 = time.time()
z = sorted(y, key=lambda x: x.x)
print("4. Sorting sorted list of random objects took {:.2f} seconds".format(time.time() - t0))
- Sorting list of random numbers took 7.46 seconds
- Sorting sorted list of random numbers took 1.02 seconds
- Sorting list of random objects took 10.08 seconds
- Sorting sorted list of random objects took 11.50 seconds
Cmethoddef __lt__(self, other): return self.x < other.xand drop the lambda function when sorting. Unfortunately most of the difference is that sorting the unsorted list of objects now takes much longer.