
Is there a way to combine multiple regex statements into one so it can do different subs on a single pass?

no_Punct = re.sub(r'(\w)([?:!.,;-]+)(\s)', r'\1 ', raw)
no_Punct = re.sub(r'(\s)([-]+)(\s)', r'\1', no_Punct)

The input string is 'raw'. I am trying to strip certain punctuation at the ends of words and remove hyphens that are surrounded by a space on each side. Can I combine both of these into one statement?

Given the input of: This is a sentence! One-fourth equals .25.

Output is: This is a sentence one fourth equals .25

  • Can you show some sample input and expected outputs? Commented Oct 31, 2012 at 23:13
  • Sample input/output is added. Commented Nov 6, 2012 at 21:17

1 Answer


Trivially, by just substituting one into the other:

no_Punct = re.sub(r'(\s)([-]+)(\s)', r'\1', re.sub(r'(\w)([?:!.,;-]+)(\s)', r'\1 ', raw))

Although this may also work:

no_Punct = re.sub(r'(?<=\w)[?:!.,;-]+(?=\s)|(?<=\s)-+\s', '', raw)
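
Note that nesting the two calls still runs two substitutions one after the other; only the alternation form scans the string once. If the rules ever need different replacements, one possible approach is to give each alternative a named group and pick the replacement in a callback. The following is only a sketch under that assumption: the names `COMBINED`, `REPLACEMENTS`, `strip_punct`, and `strip_punct_two_pass` are made up for illustration and do not come from the question.

```python
import re

# Each named group is one rule. The lookarounds keep neighbouring characters
# out of the match, so one rule cannot consume text the other rule needs.
COMBINED = re.compile(
    r'(?<=\w)(?P<punct>[?:!.,;-]+)(?=\s)'  # punctuation after a word char, before whitespace
    r'|(?<=\s)(?P<dash>-+\s)'              # hyphens plus one whitespace char, after whitespace
)

# Replacement text per rule (hypothetical table); both happen to be empty here.
REPLACEMENTS = {'punct': '', 'dash': ''}


def strip_punct(raw):
    """Apply both rules in a single scan of the string."""
    # m.lastgroup is the name of the group that matched in this alternative.
    return COMBINED.sub(lambda m: REPLACEMENTS[m.lastgroup], raw)


def strip_punct_two_pass(raw):
    """The two substitutions from the question, kept for comparison."""
    step1 = re.sub(r'(\w)([?:!.,;-]+)(\s)', r'\1 ', raw)
    return re.sub(r'(\s)([-]+)(\s)', r'\1', step1)


sample = 'This is a sentence! One-fourth equals .25.'
print(strip_punct(sample))  # This is a sentence One-fourth equals .25.
```

On the sample sentence both versions give the same result. Neither one touches the hyphen in "One-fourth" or the final period, because both patterns require whitespace after the punctuation. One small difference: the two-pass version rewrites the whitespace character after the punctuation to a single space, while the combined version leaves that character as it was.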

1 Comment

For the first method you posted, wouldn't that still make two passes? One pass is made to eliminate all punctuation at the end of a word/sentence and then that output is fed into the next sub. That still looks like 2 passes through the whole string. Is that correct?
