I am learning Python and am having trouble with sorting. The key argument feels too limiting and difficult to use once the sorting logic gets more complicated. Here is the list I want to sort:

['A1', 'AA1', 'B3', 'B2', 'BB1', 'AZ15']

where each value is like an Excel cell reference (i.e. 'BB1' > 'AZ15' > 'AA1' > 'B3' > 'B2' > 'A1').

Here is the solution I came up with after reading the following guide.

def cmp_cell_ids(name1, name2):
    def split(name):
        letter = ''
        number = ''
        for ch in name:
            if ch in '1234567890':
                number += ch
            else:
                letter += ch
        return letter, int(number)
    ltr1, num1 = split(name1)
    ltr2, num2 = split(name2)
    if len(ltr1) == len(ltr2):
        if ltr1 == ltr2:
            return num1 > num2
        else:
            return ltr1 > ltr2
    return len(ltr1) > len(ltr2)

def cmp_to_key(mycmp):
    class K:
        def __init__(self, obj, *args):
            self.obj = obj
        def __lt__(self, other):
            return not mycmp(self.obj, other.obj)
        def __gt__(self, other):
            return mycmp(self.obj, other.obj)
        def __eq__(self, other):
            return self.obj == other.obj
        def __le__(self, other):
            if self.__eq__(other):
                return True
            return self.__lt__(other)
        def __ge__(self, other):
            if self.__eq__(other):
                return True
            return self.__gt__(other)
        def __ne__(self, other):
            return self.obj != other.obj
    return K

key_cell_ids_cmp = cmp_to_key(cmp_cell_ids)
cell_ids = ['A1','AA1','B3','B2','BB1','AZ15']
cell_ids.sort(key=key_cell_ids_cmp)
print(cell_ids)

This gives the desired output:

['A1', 'B2', 'B3', 'AA1', 'AZ15', 'BB1']

I am wondering if there is an easier or more Pythonic way to implement this (in particular, I would love to get rid of that wrapper class).

2 Answers

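First of all, writing (or copy-pasting) a cmp_to_key function is unnecessary. Just use the one in functools. Note that functools.cmp_to_key expects the comparison function to return a negative, zero, or positive number rather than a boolean.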
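
As a rough sketch of what that could look like for the cell IDs in the question: the comparison below returns a negative, zero, or positive integer and is wrapped with the standard library's functools.cmp_to_key. The helper name split_cell is made up for illustration, and the code assumes every ID is one or more uppercase letters followed by digits.

```python
import re
from functools import cmp_to_key

def split_cell(name):
    # Hypothetical helper: 'AZ15' -> ('AZ', 15).
    # Assumes uppercase letters followed by digits.
    m = re.fullmatch(r"([A-Z]+)(\d+)", name)
    return m.group(1), int(m.group(2))

def cmp_cell_ids(name1, name2):
    # Three-way comparison: negative if name1 sorts first,
    # zero if equal, positive if name2 sorts first.
    ltr1, num1 = split_cell(name1)
    ltr2, num2 = split_cell(name2)
    if len(ltr1) != len(ltr2):
        return len(ltr1) - len(ltr2)
    if ltr1 != ltr2:
        return -1 if ltr1 < ltr2 else 1
    return num1 - num2

cell_ids = ['A1', 'AA1', 'B3', 'B2', 'BB1', 'AZ15']
sorted_ids = sorted(cell_ids, key=cmp_to_key(cmp_cell_ids))
print(sorted_ids)
```

This removes the hand-written wrapper class, although a plain key function is usually simpler still.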

In this case, though, it would be a lot more natural to use a key! Just split each element into a tuple of the column-name length (so B is before AA), the column letters as a string, and the row number as an integer, and rely on the natural lexicographic ordering of tuples.

Viz:

import re

def cell_key(cell):
    m = re.match(r"([A-Z]+)(\d+)", cell)
    return (len(m.group(1)), m.group(1), int(m.group(2)))

cells = ['A1', 'AA1', 'B3', 'B2', 'BB1', 'AZ15']

print(sorted(cells, key=cell_key))

13 Comments

Which Python 3 version are you using? Neither m[1], which indexes a match object, nor sorting heterogeneous data is possible in Python 3.
@Kasramvd 3.6, since you asked, which was when Match got a __getitem__ method. As for "sorting heterogeneous data", I'm not sure what you mean. All keys are int,str,int tuples, being compared in the normal way.
@Kasramvd If you'd like to see how tuple comparison works, try doing ('A', 2) < ('B', 1). (That should work in any version of python, not just 3.6 or later.)
Instead of (len(m[1]), m[1], int(m[2])), you could also use (int(m[1], 36), int(m[2])), i.e. treat the first part as a number base 36 (digits 0-9 are not used, but that does not hurt sorting).
@Kasramvd Yeah, as I said you'll need python 3.6 for the getitem method. Or just change m[1] to m.group(1) and similarly for m[2].
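
The base-36 idea from the comments can be sketched roughly as follows. The function name cell_key_base36 is made up for illustration, and the code assumes well-formed IDs of uppercase letters followed by digits. It relies on 'A' mapping to 10 in base 36, so a longer letter prefix always yields a larger number than a shorter one.

```python
import re

def cell_key_base36(cell):
    # Hypothetical variant of the key: read the letters as a base-36 number.
    # 'A' is 10 and 'Z' is 35, so 'Z' (35) sorts before 'AA' (370).
    m = re.match(r"([A-Z]+)(\d+)", cell)
    return (int(m.group(1), 36), int(m.group(2)))

cells = ['A1', 'AA1', 'B3', 'B2', 'BB1', 'AZ15']
result = sorted(cells, key=cell_key_base36)
print(result)
```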

Very similar solution to @Sneftel's, but I approached the problem by finding the index of the first numeric character.

import re

A = ['A1', 'AA1', 'B3', 'B2', 'BB1', 'AZ15']

def sorter(x):
    n = re.search(r'\d', x).start()
    return (len(x[:n]), x[:n], int(x[n:]))

res = sorted(A, key=sorter)

print(res)

['A1', 'B2', 'B3', 'AA1', 'AZ15', 'BB1']
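
For comparison, here is a minimal sketch of the same key built without re, by stripping the trailing digits to find where the letters end. The name cell_key_no_regex is made up for illustration, and the code assumes each ID ends in at least one digit.

```python
from string import digits

def cell_key_no_regex(cell):
    # Strip trailing digits to isolate the letter prefix,
    # then parse the remainder as the number.
    letters = cell.rstrip(digits)
    return (len(letters), letters, int(cell[len(letters):]))

cells = ['A1', 'AA1', 'B3', 'B2', 'BB1', 'AZ15']
no_regex_result = sorted(cells, key=cell_key_no_regex)
print(no_regex_result)
```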
