2

I was experimenting with python sorting using key. I'm interested in the inner working of the algorithm. Is it roughly equivalent to Schwartzian transform (Decorate-Sort-Undecorate)?

Specifically:

  • The spec keys are extracted only once. Can it be asssumed that this happens before any comparisions occur?
  • How are the extracted keys maintained in memory? As tuples (key, orginal_value) or does it use some other method.

I used the following test program

class Isbn:
    def __init__(self, isbn_num):
        self.isbn_num = isbn_num

    def __lt__(self, other):
        print(f"__lt__ {self.isbn_num} {other.isbn_num}")
        return self.isbn_num < other.isbn_num

    def __repr__(self) -> str:
        return f'Isbn({self.isbn_num})'


class Book:
    def __init__(self, isbn):
        self.isbn = Isbn(isbn)

    def __repr__(self) -> str:
        return f'Book({self.isbn})'

    @property
    def key(self):
        print(f"key {self.isbn}")
        return self.isbn


books = [Book(5), Book(10), Book(6), Book(2)]
books.sort(key=lambda b: b.key)
print(books)

Which gives the following output:

key Isbn(5)
key Isbn(10)
key Isbn(6)
key Isbn(2)
__lt__ 10 5
__lt__ 6 10
__lt__ 6 10
__lt__ 6 5
__lt__ 2 6
__lt__ 2 5
[Book(Isbn(2)), Book(Isbn(5)), Book(Isbn(6)), Book(Isbn(10))]
1
  • That's a nice experiment you've performed! Commented May 17, 2019 at 8:33

2 Answers 2

1

Talking specifically about CPython (other implementations of Python are available):

It does do the transform. It currently builds a C-array of keys before it starts to sort. This is done entirely in C - so it's not a Python list. No Python tuples are involved.

This is an excerpt from the (current) relevant C code (though of course this will change as CPython evolves), taken from listobject.c.

key_func is the key function. saved_ob_size is the length of the list. saved_ob_item is the array from the original list.

2239 if (keyfunc == NULL) { 
         ...
2243     } 
2244     else { 
         ...    
2256         for (i = 0; i < saved_ob_size ; i++) { 
2257             keys[i] = PyObject_CallFunctionObjArgs(keyfunc, saved_ob_item[i], 
2258                                                    NULL); 
                 ...
2265             } 
2266         } 
Sign up to request clarification or add additional context in comments.

1 Comment

Nice! Thanks to your answer, I was able to find the actual issue: bugs.python.org/issue9915 and changset: github.com/python/cpython/commit/…
1

Yes Python does use the Schwartzian transform in some cases. From this documentation.

Python programmers use the transform in sorts where the comparison operation may be expensive.

3 Comments

But does the sort use a Schwartzian transform?
This is not the point, I crafted my example to have a hook into __lt__. Your answer does not address any of my questions.
Yes, my bad, I got a link where it tells about Python sorting and the transform, let me look more into this!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.