2

I'm trying to sort a list while ignoring a prefix. This question has been answered here. It should be straightforward, only it's not working. Here's what I have:

def sort_it(lst, ignore):

  return lst.sort(key=lambda x: x.strip(ignore))

myList = ["cheesewhiz", "www.cheese.com", "www.wagons.com", "www.apples.com", "www.bananas.com"]

ignoreThis = "www."
sort_it(myList, ignoreThis)
print myList

Only the sorting gets mixed up, because the first item doesn't contain the prefix to ignore. I'm not sure whether adding a check for the prefix inside the lambda is the Pythonic approach.

I expect the results to be in alphabetical order, ignoring the www. prefix:

www.apples.com
www.bananas.com
www.cheese.com
cheesewhiz
www.wagons.com
8
  • Change from sort to sorted. The sort method changes the list in place; sorted leaves the original list alone and returns a sorted copy. Commented Nov 20, 2017 at 22:03
  • 1
    Your key function is wrong. str.strip isn't suited to remove a prefix. Use lambda x: x[4:] if x.startswith('www.') else x. Commented Nov 20, 2017 at 22:04
  • @Prune Sure, but that isn't the fundamental problem here. Commented Nov 20, 2017 at 22:04
  • 3
    @SwakeertJain The problem is that "www.wagons.com".strip("www.") returns 'agons.com' Commented Nov 20, 2017 at 22:09
  • 1
    Sorry! my bad. I misunderstood the last statement in the question Commented Nov 20, 2017 at 22:27

1 Answer

4

strip doesn't work that way. It treats its argument as a set of characters and removes any of them from both ends of the string, so it can remove more than the prefix you passed. Also, list.sort sorts in place and returns None, so there's no point in returning its result (or use sorted, which returns a sorted copy and leaves your argument untouched, which may be less of a surprise for callers).
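To make the two points above concrete, here is a small sketch using the strings from the question. The variable names are only illustrative.

```python
ignore = "www."

# strip() removes any of the characters 'w' and '.' from both ends,
# so the leading 'w' of "wagons" is eaten along with the prefix.
stripped = "www.wagons.com".strip(ignore)
print(stripped)   # agons.com

# A string without the prefix is left alone here, but only because
# it neither starts nor ends with 'w' or '.'.
untouched = "cheesewhiz".strip(ignore)
print(untouched)  # cheesewhiz

# list.sort() sorts in place and returns None.
returned = ["b", "a"].sort()
print(returned)   # None
```

So the key for "www.wagons.com" becomes "agons.com", which sorts before the other entries instead of after them.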

You probably want str.replace instead to get rid of www., or re.sub(r"^www\.", "", x) (note the escaped dot; an unescaped . matches any character):

def sort_it(lst, ignore):
  lst.sort(key=lambda x: x.replace(ignore,""))

myList = ["cheesewhiz", "www.cheese.com", "www.wagons.com", "www.apples.com", "www.bananas.com"]

ignoreThis = "www."
sort_it(myList, ignoreThis)
print(myList)

result:

['www.apples.com', 'www.bananas.com', 'www.cheese.com', 'cheesewhiz', 'www.wagons.com']
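One caveat with str.replace, sketched here with a made-up hostname that is not part of the question's data: it removes every occurrence of the substring, not just a leading one, so the sort key can change for strings that merely contain www. somewhere in the middle.

```python
ignore = "www."

# Hypothetical value, chosen only to show the effect.
host = "news.www.example.org"

# replace() drops the substring wherever it appears, not only at the start.
replace_key = host.replace(ignore, "")
print(replace_key)  # news.example.org
```

For the list in the question this makes no difference, since www. only ever appears at the start.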

More accurately, if you want to remove www. from the key only when the string starts with it, you could use a regex (you need to escape the prefix, since it contains a dot):

import re

def sort_it(lst, ignore):
  lst.sort(key=lambda x: re.sub("^"+re.escape(ignore),"",x))

Or, without regex and maybe the best solution, use a conditional (ternary) expression with startswith, since regular expressions aren't really needed here:

def sort_it(lst, ignore):
  lst.sort(key=lambda x: x[len(ignore):] if x.startswith(ignore) else x)
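If mutating the caller's list is undesirable, a non-mutating variant can be sketched with sorted. The function name sorted_ignoring_prefix is hypothetical, and the sketch assumes Python 3.9 or later, where str.removeprefix is available; on older versions the startswith expression above does the same job.

```python
def sorted_ignoring_prefix(items, prefix):
    """Return a new sorted list; the prefix is ignored only where a string starts with it."""
    # str.removeprefix (Python 3.9+) returns the string unchanged
    # when it does not start with the prefix.
    return sorted(items, key=lambda s: s.removeprefix(prefix))


my_list = ["cheesewhiz", "www.cheese.com", "www.wagons.com", "www.apples.com", "www.bananas.com"]
result = sorted_ignoring_prefix(my_list, "www.")
print(result)
# ['www.apples.com', 'www.bananas.com', 'www.cheese.com', 'cheesewhiz', 'www.wagons.com']
```

The original list is left in its initial order, so the caller decides whether to keep or replace it.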

1 Comment

Thank you for your comprehensive answer. Updated my code in the question to use the ignore parameter.
