I'm learning the well known sorting algorithms and implementing them myself. I have recently done merge sort and the code I have is:
def merge(l, r, direction):
# print('merging')
# print(l, r)
# adding infinity to end of list so we know when we've hit the bottom of one pile
l.append(inf)
r.append(inf)
A = []
i, j = 0, 0
while (i < len(l)) and (j < len(r)):
if l[i] <= r[j]:
A.append(l[i])
i += 1
else:
A.append(r[j])
j += 1
# removing infinity from end of list
A.pop()
return(A)
def merge_sort(num_lst, direction='increasing', level=0):
if len(num_lst) > 1:
mid = len(num_lst)//2
l = num_lst[:mid]
r = num_lst[mid:]
l = merge_sort(l, level=level + 1)
r = merge_sort(r, level=level + 1)
num_lst = merge(l, r, direction)
return num_lst
What I have seen in other implementations that differs from my own is in the merging of lists. Where I just create a blank list and append elements in numerical order other pass the preexisting into merge and overwrite each element to create a list in numerical order. Something like:
def merge(arr, l, m, r):
n1 = m - l + 1
n2 = r- m
# create temp arrays
L = [0] * (n1)
R = [0] * (n2)
# Copy data to temp arrays L[] and R[]
for i in range(0 , n1):
L[i] = arr[l + i]
for j in range(0 , n2):
R[j] = arr[m + 1 + j]
# Merge the temp arrays back into arr[l..r]
i = 0 # Initial index of first subarray
j = 0 # Initial index of second subarray
k = l # Initial index of merged subarray
while i < n1 and j < n2 :
if L[i] <= R[j]:
arr[k] = L[i]
i += 1
else:
arr[k] = R[j]
j += 1
k += 1
# Copy the remaining elements of L[], if there
# are any
while i < n1:
arr[k] = L[i]
i += 1
k += 1
# Copy the remaining elements of R[], if there
# are any
while j < n2:
arr[k] = R[j]
j += 1
k += 1
I'm curious about the following:
Problem
Is using a append() on a blank list a bad idea? By my understanding when python creates a list it grabs a certain size chunk of memory and if our list grows beyond that copies the list to a different and larger section of memory (which seems it would be pretty high cost if it happened even once for a large list let alone repeatedly).
Is there just a higher cost to using append() compared to accessing a list by index? It would seemed to me append would be able to do things at a fairly low cost.