I have a problem with my sorting method.

My list contains objects of an Address class, which has a city attribute.

My list looks like this (simplified):

[Address('Paris'), Address('Denver'), Address('Paris'), Address('Test'), Address('Denver')]

In this example, two cities are duplicated: Paris and Denver.

I want to have a result like:

[Address('Denver'), Address('Denver'), Address('Paris'), Address('Paris'), Address('Test')]

That is, sorted by duplicate count (most frequent first) and, when the counts are equal, in alphanumeric order.

I tried:

self.dictionnary.sort(key=lambda address: len([x for x in self.dictionnary if address.city == x.city]))

But this doesn't work...

Can anyone help me?

Thank you in advance!

2 Answers

1
import collections
counts = collections.Counter(address.city for address in self.dictionnary)
self.dictionnary.sort(key=lambda address: (-counts[address.city], address.city))

By using Counter to count the duplicates in a separate step, you save the overhead of scanning the list each time you need a new key. This can make a big difference in the run time for a long list. The key then becomes a tuple; by taking the negative of the count, the larger counts will come first in the sort order. The second part of the tuple, the city name itself, will only be considered when the counts are equal.
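To make the snippet above easier to try outside the asker's class, here is a minimal, self-contained sketch. The Address class and the sort_by_duplicates helper are hypothetical stand-ins (the question only shows that Address has a city attribute), and the sketch returns a new list with sorted() rather than sorting self.dictionnary in place.

```python
from collections import Counter


class Address:
    """Hypothetical stand-in for the asker's Address class."""

    def __init__(self, city):
        self.city = city

    def __repr__(self):
        return f"Address({self.city!r})"


def sort_by_duplicates(addresses):
    """Return a new list: most frequent cities first, ties broken by city name."""
    # Count every city once, up front, instead of rescanning the list per key.
    counts = Counter(address.city for address in addresses)
    # Negating the count puts larger counts first; the city name breaks ties.
    return sorted(addresses, key=lambda address: (-counts[address.city], address.city))


addresses = [Address(c) for c in ["Paris", "Denver", "Paris", "Test", "Denver"]]
result = sort_by_duplicates(addresses)
print([a.city for a in result])
# ['Denver', 'Denver', 'Paris', 'Paris', 'Test']
```

Calling list.sort with the same key would sort in place, as in the answer's original line.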

3 Comments

Could the downvoter please explain themselves? I just checked this code and it works perfectly. Plus it's more efficient than counting the entire list every time you need a new key.
I can see why someone might downvote this: it's a code-only answer. In general, including an explanation really helps to improve the quality of your post.
@vaultah for me that's worth a comment, not a downvote. I reserve my downvotes for things that are actually incorrect.
1

The problem is that Paris and Denver both have a count of 2, so the sort key cannot tell them apart and they keep their original relative order.

If you add the city string to the sort key, so that ties are broken lexically, it should work.

Example:

from collections import Counter

l = ['a', 'b', 'a', 'b', 'c']
c = Counter(l)
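# Sorting by count alone: 'a' and 'b' tie at 2 and the sort is stable
l.sort(key=lambda x: -c[x])
# l is unchanged: ['a', 'b', 'a', 'b', 'c']
# Adding the value itself as a second key breaks the tie
l.sort(key=lambda x: (-c[x], x))
# l is ['a', 'a', 'b', 'b', 'c']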

Edit: Mark's solution uses a Counter, which is much better than recounting every time. I am going to steal that idea.
