I have a pattern: "two_or_more_characters - zero_or_more_characters" and I want to replace it with "two_or_more_characters", where "-" is a dash.

I wrote a regex for it:

re.sub(r'-[\w(){}\[\],.?! ]+', '', t)

It works as expected in some cases: for t = "red-fox" the result is red. But it does not do what I need for t = "r-fox": the result is r, whereas I want r-fox to be kept unchanged because only one character precedes the dash.

If the text has more than one dash, only the text after the last dash should be removed. For example, for t = "r-fox-dog" the result should be r-fox.
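For reference, the behaviour described above can be reproduced with a short script. This is only a sketch: the helper name `strip_original` is made up for illustration, and the pattern is the one from the question, unchanged.

```python
import re

# Pattern from the question: a dash followed by one or more allowed characters.
# The character class does not contain '-', so each dash starts a new match.
PATTERN = re.compile(r'-[\w(){}\[\],.?! ]+')


def strip_original(text: str) -> str:
    """Apply the question's substitution as-is (hypothetical helper name)."""
    return PATTERN.sub('', text)


for t in ("red-fox", "r-fox", "r-fox-dog"):
    print(t, "->", strip_original(t))
```

Nothing in the pattern constrains what comes before the dash, and every dash-led segment is removed, which is why both "r-fox" and "r-fox-dog" collapse to r.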

  • 'r ' is 'r' <space>, which IS 2 or more characters, so in my opinion the replacement is correct here. Perhaps, give us a better explanation of what you want. Do you want to ignore spaces before dash? Commented Dec 12, 2022 at 7:50
  • What if your text has more dashes? Like "todo-nodo-undo"? Commented Dec 12, 2022 at 7:54
  • @PetrBlahos I edited the body of Q to address your questions Thanks. Commented Dec 12, 2022 at 7:58
  • @zvone: True enough, I would recommend using rpartition though. But regex will automatically solve the "at least 2 characters" task. Commented Dec 12, 2022 at 7:59
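The rpartition approach mentioned in the last comment could look roughly like the sketch below. The function name `strip_after_last_dash` and the `min_prefix` parameter are assumptions made for illustration; unlike the regex, the "at least 2 characters" check has to be written out by hand.

```python
def strip_after_last_dash(text: str, min_prefix: int = 2) -> str:
    """Drop everything from the last dash onward, but only if at least
    `min_prefix` characters precede that dash (hypothetical helper)."""
    # rpartition splits on the LAST occurrence of the separator.
    # If there is no dash, it returns ('', '', text), so sep is empty.
    head, sep, _tail = text.rpartition("-")
    if sep and len(head) >= min_prefix:
        return head
    return text


for t in ("red-fox", "r-fox", "r-fox-dog", "nodash"):
    print(t, "->", strip_after_last_dash(t))
```

Here "r-fox" is returned unchanged because only one character precedes the dash, while "r-fox-dog" becomes r-fox.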

1 Answer

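Use a capture group (the part in parentheses in the regular expression) together with the backreference \1, which "pastes" the captured text into the replacement. The .{2,} requires at least two characters before the dash, and because it is greedy it extends up to the last dash. I think this works well enough: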

import re
re.sub(r'(.{2,})-.*', r'\1', "ss-fox")  # returns 'ss'
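To check the pattern against the cases from the question, a small test sketch follows. The wrapper name `drop_after_last_dash` is an assumption for illustration; the regex itself is the one shown above.

```python
import re

# (.{2,})  at least two characters, greedy, so it reaches up to the last dash
# -        the dash itself
# .*       whatever follows, which is discarded
LAST_DASH = re.compile(r'(.{2,})-.*')


def drop_after_last_dash(text: str) -> str:
    """Keep only the captured prefix (hypothetical wrapper name)."""
    return LAST_DASH.sub(r'\1', text)


cases = {
    "ss-fox": "ss",
    "red-fox": "red",
    "r-fox": "r-fox",      # only one character before the dash: no match
    "r-fox-dog": "r-fox",  # only the part after the last dash is removed
}
for text, expected in cases.items():
    assert drop_after_last_dash(text) == expected
```

Note that `.` does not match newlines unless `re.DOTALL` is passed, so this sketch assumes single-line input.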