I am trying to find a more elegant way of replacing multiple patterns in a string using re. The problem is a small one: collapse every run of two or more spaces into a single space, and insert a space wherever a letter immediately follows a period. So the sentence
'This is a strange sentence. There are too many spaces.And.Some periods are not. placed properly.'
should be corrected to:
'This is a strange sentence. There are too many spaces. And. Some periods are not. placed properly.'
My solution, below, seems a bit messy. I was wondering whether there was a nicer way of doing this, as in a one-liner regex.
def correct( astring ):
    import re
    bstring = re.sub( r' +', ' ', astring )
    letters = [frag.strip( '.' ) for frag in re.findall( r'\.\w', bstring )]
    for letter in letters:
        bstring = re.sub( r'\.{}'.format( letter ), '. {}'.format( letter ), bstring )
    return bstring
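
For comparison, here is a rough sketch of what a shorter version might look like. It is only one possible direction, not a definitive answer. The name correct_sketch and the sample input are made up for illustration. It keeps the \w test from the code above, so it would also put a space after a period that is followed by a digit or underscore (for example in "3.14").

```python
import re

def correct_sketch(text):
    # Hypothetical helper, not part of the original code.
    # Inner sub: collapse any run of spaces into a single space.
    # Outer sub: replace a period with '. ' when a word character follows;
    # the lookahead (?=\w) checks the next character without consuming it.
    return re.sub(r'\.(?=\w)', '. ', re.sub(r' +', ' ', text))

# Illustrative input with extra spaces and missing spaces after periods.
sample = 'This is  a strange sentence.   There are too many spaces.And.Some periods are not. placed properly.'
print(correct_sketch(sample))
```

The lookahead avoids the findall loop entirely, since the character after the period never has to be captured and re-inserted.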