I'm trying to create a regex that takes a string and replaces certain characters according to these rules:

  1. Two or more consecutive spaces are reduced to one space
  2. The following chars are replaced by a word: "#" -> "number", "@" -> "at"
  3. Spaces are replaced with "-", unless they are at the end of the string
  4. Contains only a-z, A-Z, 0-9 and: !@#$%&/,
  5. Two or more consecutive "-" are reduced to one
"Hello, Wor--ld! 1$2@3-   " -> "hello-wor-ld-1-dollars-2-at-3"

My code:

import re

name = "Hello, World! 1$2@3-   "

name = re.sub("[^a-zA-Z0-9]+", "-", name.lower())

print(name)

But it results in "hello-world-1-2-3-"

  • Does it become lowercase too? Can you edit the example to include a # so it is clear what it becomes? Commented Jul 4, 2021 at 16:23
  • Your example doesn't follow your rules at all. What about "dollars", the comma, the "-" between the elements at the end... I was ready to answer, but your question is very incomplete regarding your example. Commented Jul 4, 2021 at 16:25
  • It seems the !@#$%& should all correspond to some words. What is the full list? "# -> number, @ -> at, $ -> dollars", and the rest? Please fix your test case. Commented Jul 4, 2021 at 16:28

1 Answer

Here is the code that you may use as a basis to solve your issue:

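import re

name = "Hello, World! 1$2@3-   "
# Normalize whitespace and case, then turn every run of other chars into a hyphen
name = re.sub("[^a-zA-Z0-9@#$&]+", "-", " ".join(name.lower().split()))
# Words that stand in for the special chars
dct = {'#': 'number', '@': 'at', '$': 'dollars', '&': 'and'}
# Replace each special char with its word, surrounded by hyphens
name = re.sub(r'[$@#]', lambda x: f"-{dct[x.group()]}-", name)
print(name.strip('-'))
# => hello-world-1-dollars-2-at-3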

See the Python demo.

Notes:

  • " ".join(name.lower().split()) - lowercases the string, removes leading/trailing whitespace and shrinks each run of whitespace between words to a single space
  • re.sub("[^a-zA-Z0-9@#$&]+", "-", ...) - replaces each run of one or more chars other than alphanumeric, #, @, $ and & chars with a single hyphen
  • re.sub(r'[$@#]', lambda x: f"-{dct[x.group()]}-", name) - replaces the specified special chars with their words, wrapped in hyphens
  • name.strip('-') - removes leading/trailing hyphens.
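
The steps in these notes can also be folded into one helper. The following is only a sketch under a few assumptions: the function name slugify and the constant WORDS are made up for illustration, and the '&' -> 'and' mapping comes from dct above, not from the question. It differs from the snippet above in two ways: it substitutes the words first (padded with spaces) and only then collapses every non-alphanumeric run, so existing double hyphens and hyphens next to a replaced symbol are reduced to one; and it also replaces '&', which the snippet above keeps in the string but never converts.

```python
import re

# Assumed mapping: the question only confirms '#', '@' and '$';
# '&' -> 'and' is taken from the answer's dct.
WORDS = {"#": "number", "@": "at", "$": "dollars", "&": "and"}


def slugify(text):
    """Lowercase, swap mapped symbols for words, hyphenate everything else."""
    text = text.lower()
    # Pad each word with spaces so it becomes its own token.
    text = re.sub(r"[#@$&]", lambda m: f" {WORDS[m.group()]} ", text)
    # Any run of non-alphanumerics (spaces, punctuation, hyphens) -> one hyphen.
    text = re.sub(r"[^a-z0-9]+", "-", text)
    # Drop hyphens left over at either end.
    return text.strip("-")


print(slugify("Hello, Wor--ld! 1$2@3-   "))
# => hello-wor-ld-1-dollars-2-at-3
```

Because the hyphen collapsing runs last, an input such as "1 $ 2" gives "1-dollars-2" rather than "1--dollars--2".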

5 Comments

How can I check for a pattern so that, for example, "$2" produces "2-dollars" instead of "dollars-2"? That is, the combination of $ directly before a number.
When I put in, for example, "Fries for $2", it returns "fries-for2-dollars" instead of "fries-for-2-dollars", and for "Hello, World! 1$2@3- " it returns "hello-world-12-dollars-at-3" instead of "hello-world-1-dollars-2-at-3".
How would I go about making "Fries for $2.99@" return "fries-for-2-dollars-99-cents-at"?
If the code is rearranged to: name = re.sub(r'[$@#]', lambda x: f"-{dct[x.group()]}-", name) name = re.sub(r"-*(\$)(\d+)-*", r"\2\1", name), the output would be "hello-world-1-dollars-2-at-3" but "fries-for-dollars-2" instead of "fries-for-2-dollars".
@Adele Please check this Python demo. Your expectations start to diverge: I think 1$2 means 1-2-dollars and not 1-dollars-2. Otherwise, the format is irregular and cannot be solved with regex.
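
For the "$2" and "$2.99" cases raised in these comments, one possible approach is to handle a dollar sign followed by digits before the generic symbol mapping runs. This is a sketch, not the code behind the linked demo: the names slugify_prices, _price and WORDS are hypothetical, and the "cents" wording is taken from the comment above. It follows the reading that 1$2 means 1-2-dollars.

```python
import re

# Assumed mapping, reused from the answer's dct.
WORDS = {"#": "number", "@": "at", "$": "dollars", "&": "and"}


def _price(m):
    # "$2" -> " 2 dollars ", "$2.99" -> " 2 dollars 99 cents "
    words = f" {m.group(1)} dollars "
    if m.group(2):
        words += f"{m.group(2)} cents "
    return words


def slugify_prices(text):
    text = text.lower()
    # Prices first: "$" followed by digits, with an optional ".digits" part.
    text = re.sub(r"\$(\d+)(?:\.(\d+))?", _price, text)
    # Remaining symbols (including any "$" not followed by a digit).
    text = re.sub(r"[#@$&]", lambda m: f" {WORDS[m.group()]} ", text)
    # Collapse every run of non-alphanumerics into one hyphen, then trim.
    text = re.sub(r"[^a-z0-9]+", "-", text)
    return text.strip("-")


print(slugify_prices("Fries for $2.99@"))
# => fries-for-2-dollars-99-cents-at
print(slugify_prices("Hello, World! 1$2@3-   "))
# => hello-world-1-2-dollars-at-3
```

Note that the second result is "1-2-dollars", not the "1-dollars-2" from the original question; both outputs cannot be produced by the same rule.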
