I'm trying to create a more efficient way to sort lists and dictionaries in python and came across Efficient data structure keeping objects sorted on multiple keys. There the suggested solution was to use the heapq module.
However, in my tests a heap seems to be twice as slow as the native Python sorting algorithm. Below is the code I used to do a simple test. Results were for example:
Heap: 0.005993366241455078
Standard: 0.0020036697387695312
Is there a way to actually use a heap and increase the performance, as the post linked above claims? What would that code look like?
Here's the code to test it:
import random
import time
from heapq import *
standardlist = []
heaplist = []
for i in range(10000):
num = random.randint(0,10000)
standardlist.append(num)
heappush(heaplist, num)
# Standard sorting method:
start_time = time.time()
sorted_list = sorted(standardlist)
finish_time_1 = time.time() - start_time
# Heap sorting method:
start_time = time.time()
heap_sorted_list = [heappop(heaplist) for i in range(len(heaplist))]
finish_time_2 = time.time() - start_time
print("Standard Finish Time:", finish_time_1)
print("Heap Finish Time:", finish_time_2)
O(log n)time). If you delete all items, then it will takeO(n log n)time. Pythonsortuse a TimSort which also runs inO(n log n)but it is implemented in C, very optimized and TimSort is known to be usually faster than HeapSort. There is no chance you can be faster thansortwith this code. In fact, with any general-purpose sort written in pure-Python.k<<n. In fact, there are faster algorithm in this case too (partitioning can be done inO(n)and HeapSort takesO(n + k log n)not to mention the higher constant than optimized partitioning algorithms).