I am trying to replace all digits in my data source with "numbr". Some examples:

  1. 1234-546-234235-1232-1242-123124 -> numbr
  2. 125436 -> numbr
  3. abc1231241 -> abcnumbr

I have tried re.sub(r'(\d+[/-]*\d+)(R?)', "numbr", token), but it does not replace example 1 properly. Any ideas what I am missing?

1 Answer

Code

(?:\d-\d|\d)+

An alternative, (?:\d(?:-\d)?)+, can also be used, but it takes one extra step to complete the match.


Results

Input

1234-546-234235-1232-1242-123124
125436
abc1231241

Output

numbr
numbr
abcnumbr
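
Since the question uses Python's re.sub, here is a minimal sketch of applying the pattern to the three inputs above. The names NUMBER_RUN and mask_numbers are made up for this illustration and are not part of the original answer.

```python
import re

# Pattern from the answer: a run of digits in which single hyphens
# are allowed only directly between two digits.
NUMBER_RUN = re.compile(r"(?:\d-\d|\d)+")


def mask_numbers(token: str) -> str:
    """Replace every digit run (hyphens between digits included) with 'numbr'."""
    return NUMBER_RUN.sub("numbr", token)


for token in ["1234-546-234235-1232-1242-123124", "125436", "abc1231241"]:
    print(token, "->", mask_numbers(token))
```

Compiling the pattern once is optional; re.sub(r"(?:\d-\d|\d)+", "numbr", token) behaves the same way for a one-off call.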

Explanation

  • (?:\d-\d|\d)+ Match either of the following one or more times
    • \d-\d Match a digit, followed by a hyphen -, followed by a digit
    • \d Match a digit
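
For contrast, this sketch shows what the pattern from the question does on example 1. It can join at most two digit groups per match, so the hyphens between successive matches are left behind, and it also needs at least two digits to match at all. The variable names are illustrative only.

```python
import re

# Pattern from the question: digits, optional separators, digits.
# One match can therefore span at most two digit groups.
question_pattern = re.compile(r"(\d+[/-]*\d+)(R?)")

token = "1234-546-234235-1232-1242-123124"
result = question_pattern.sub("numbr", token)
print(result)  # groups are consumed two at a time, hyphens between matches remain

# A lone digit is not matched, because \d+ appears twice in the pattern.
single = question_pattern.sub("numbr", "a7")
print(single)
```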

The reason to use (?:\d-\d|\d)+ instead of [\d-]+ is to avoid touching hyphens that are not between two digits. With [\d-]+, something like my-name would become mynumbrname and abc-1234 would become abcnumbr, whereas (?:\d-\d|\d)+ leaves my-name unchanged and turns abc-1234 into abc-numbr.
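
The difference between the two patterns can be checked with a short comparison. This is only a sketch; the names loose and strict are chosen here for illustration.

```python
import re

loose = re.compile(r"[\d-]+")            # any mix of digits and hyphens
strict = re.compile(r"(?:\d-\d|\d)+")    # hyphens only between two digits

for token in ["my-name", "abc-1234"]:
    print(
        token,
        "| loose:", loose.sub("numbr", token),
        "| strict:", strict.sub("numbr", token),
    )
```

The character class treats a hyphen as equivalent to a digit, which is why it rewrites hyphens in ordinary words. The alternation only consumes a hyphen when a digit sits on both sides of it.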
