0

I am implementing some sorting algorithm in Python and encountered a somewhat strange bug related to max recursion depth being exceeded. In one module, I have implemented an insertion,m erge, and quicksort. I then run some unit tests in a separate module. The quicksort on large values is giving me recursion depth exceeded errors. Here is my implementation of the algorithm in 'Sorter.py':

class QuickSort(Sorter):
    """Implements quicksort algorithm.
    Best case: O(n*log(n))
    Worst case: O(n^2)
    Average case: O(n*log(n))
    """
    @staticmethod
    def partition(numbers, low, high):
        pivot_index = low #crappy way to pick a pivot but it'll do
        pivot_val = numbers[pivot_index]

        #swap pivot with high
        temp = numbers[high]
        numbers[high] = pivot_val
        numbers[pivot_index] = temp

        store_index = low
        for index in range(low, high):
            if numbers[index] < pivot_val:
                #Swap 'index' with 'store_index'
                temp = numbers[store_index]
                numbers[store_index] = numbers[index]
                numbers[index] = temp

                store_index += 1

        #Swap pivot_val at high into appropriate index
        temp = numbers[store_index]
        numbers[store_index] = numbers[high] 
        numbers[high] = temp

        return store_index

    @staticmethod
    def sort_helper(numbers, low, high):
        if low < high:
            part_index = QuickSort.partition(numbers, low, high)
            QuickSort.sort_helper(numbers, low, part_index)
            QuickSort.sort_helper(numbers, part_index+1, high)
        return numbers  

    @classmethod
    def sort(cls, numbers):
        assert len(numbers) != 0, "Must provide a non-empty list"
        return QuickSort.sort_helper(numbers, 0, len(numbers)-1)

I can run tests in this module as follows for:

 if __name__ == '__main__':
    seq = random.sample(xrange(1,100000), 10000)
    print 'Original Seq: ', seq
    real = sorted(seq)
    quick = QuickSort.sort(seq)
    print "Real Sorted: ", real
    print 'Quick Sorted: ', quick

In my unittest module, I run the following as a test:

def test_really_large(self):
    """Tests handling of a very large sequence."""
    test_list = random.sample(xrange(1,10000), 5000)
    real_sorted = sorted(test_list)
    insert_sorted = InsertionSort.sort(test_list)
    merge_sorted = MergeSort.sort(test_list)
    quick_sorted = QuickSort.sort(test_list)

    self.assertEqual(real_sorted, insert_sorted)
    self.assertEqual(real_sorted, merge_sorted)
    self.assertEqual(real_sorted, quick_sorted)

What confuses me is that an error is only thrown when I run the unittest module. Here I get an error when I try to sort 5000 integers. However when I run a single quicksort test in the other module, I can sort beyond 20000 integers without an error being thrown. Why is this happening?


Thanks to everyone for the quick responses. I think everyone is correct in pointing out that my shoddy way of choosing the pivot is what's creating a problem with already sorted lists. I will modify this appropriately.

4
  • 4
    Why are you using a class with only static methods (the class method ignores the cls argument). Python doesn't require that you use classes for everything like Java does. Commented Dec 21, 2014 at 9:28
  • Your QuickSort works in-place. Is there a chance that InsertionSort and/or MergeSort also work in-place, and somehow break the list? Commented Dec 21, 2014 at 9:48
  • @MartijnPieters: It was a stylistic choice. Would another way have been better? Commented Dec 21, 2014 at 9:51
  • To swap numbers in python, you don't need a temp variable: numbers[store_index], numbers[index] = numbers[index], numbers[store_index] Commented Dec 21, 2014 at 9:59

2 Answers 2

3

Your quicksort algorithm is very bad at sorting sorted lists. Your pivot_val is the first number, so a sorted list is partitioned in one number and the rest. For random lists, the recursion needed is something like log_2 N, that means for 20000 elements the depth is about 15. In your sorted case, the recursion depth of 5000 elements is 5000.

Sign up to request clarification or add additional context in comments.

Comments

2

Your recursion fails to terminate when the list is already sorted.

The reason why you're seeing it fail only in the unit test is that, as Rawing pointed out, all your methods modify the list in place: so test_list is already sorted after the call to InsertionSort.

You should test each sorter in a separate unit test method, recreating the test data each time.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.