1

I have a list of words:

words = ['miel', 'extraterrestre', 'al', 'automovil', 'auto', 'revestir']

I want to sort this list using custom alphabet (it contains 26 letters of the alphabet but ordered in a different way):

g = 'zyxwvutsrqponmlkjihgfedcba'

Expected result:

['revestir', 'miel', 'extraterrestre', 'auto', 'automovil', 'al']
4

4 Answers 4

2

If speed is important and your list of words is even moderately long, you are probably better off making a lookup rather than searching for the index of each character for each word.

You can use the string function translate() and maketrans() to create a fast way of converting the input strings to a translation for sorting.

For example:

# make translation table 
trans = str.maketrans(g, "".join(sorted(g))) 

# translate words works like:
"revestir".translate(trans)   # 'ivevhgri'

# sort with it:
sorted(l, key=lambda word: word.translate(trans))

# ['revestir', 'miel', 'extraterrestre', 'auto', 'automovil', 'al']

This also has the benefit of being resilient to errors if there are characters in your strings that are not in your alphabet, which index() will choke on. They just get passed through, like:

"reve*stir".translate(trans)
# 'ivev*hgri'
Sign up to request clarification or add additional context in comments.

Comments

1

The more efficient solution, instead of indexing for every letter would be to build a dictionary beforehand and use the get function which is O(1) for every letter in the word.

pos_map = {}
for idx, letter in enumerate(g):
    pos_map[letter] = idx

def key(item):
    return [pos_map.get(c) for c in item]

sorted(words, key=key)
['revestir', 'miel', 'extraterrestre', 'auto', 'automovil', 'al']

3 Comments

Is it more efficient than the other solution that also does this and more efficient than the translate solution?
more efficient than indexing, which I explicitly state
If speed is critical, I think translate will be faster than the dictionary lookups by a fair margin. But translate and maketrans are a little difficult to understand, so maybe not worth it.
1

You can use sorted() and pass as a key lambda function which converts your string into a list of indexes of each char of this string in your custom alphabet:

words = ['miel', 'extraterrestre', 'al', 'automovil', 'auto', 'revestir']
g = 'zyxwvutsrqponmlkjihgfedcba'
sorted_words = sorted(words, key=lambda w: [g.index(c) for c in w])

P.S. This is simplest but definitely not fastest solution, so if your input list is quite big you should consider using solutions from other answers.

5 Comments

g.index(c) is very slow for large inputs
@MohammedKhalid, it wasn't my goal to write fastest solution.
@MohammedKhalid How much "very slow"? And "large" in what way?
@HeapOverflow, I think, few hundred thousands of strings would be enough to feel the difference.
@OlvinRoght See a difference? Sure. But "very slow"? I just tested with 500,000 random strings of length 7. Your solution took 2.15 seconds, Mohammed's took 1.43 seconds. If that's enough to call a solution "very slow", then Mohammed's is also very slow, as Mark's only took 0.64 seconds.
0
g = 'zyxwvutsrqponmlkjihgfedcba'
order={g[i]:i for i in range(len(g))}
sorted(words, key= lambda word:tuple([order[w] for w in word ]))

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.