0

In Python, I am trying to replace every occurrence of a pattern matched by a regex, so that:

'10am 11pm 13am 14pm 4am'

becomes

'10 am 11 pm 13 am 14 pm 4 am'

I tried

re.sub('([0-9].*)am(.*)', r'\1 am \2', ddata) 

But this only replaces the last occurrence.

and

import re
regex = re.compile('([0-9].*)am+', re.S)
myfile =  '10am 11pm 13am 14pm 4am'
myfile2 = regex.sub(lambda m: m.group().replace(r'am',r" am ",1), myfile)
print(myfile2)

This only replaces the first occurrence of 'am'.

Expected result: '10 am 11pm 13 am 14pm 4 am'

6
  • 1
    Use (\d{1,2})(?=[ap]m) and replace with "\1 " (note the trailing space), or use (\d{1,2})([ap]m) and replace with "\1 \2". Commented Apr 12, 2019 at 19:47
  • I think I was not clear about why I was using a regex in this case. Imagine the sentence: "the amphitheater opens at 10am-11am and 3pm-7pm" - we want to make sure NOT to replace 'am' in amphitheater. Commented Apr 13, 2019 at 5:50
  • 1
    The real question is do you really want to change that sentence/example? Given the conditions you set you CAN use this, but it's going to be ugly. >>> re.sub(r'(?<=\d)([ap]m)', r' \1', 'the amphitheater opens at 10am-11am and 3pm-7pm')... #OUTPUT: 'the amphitheater opens at 10 am-11 am and 3 pm-7 pm' Commented Apr 13, 2019 at 6:06
  • 1
    @FailSafe came to the same conclusion. positive lookbehind works but sentence looks ugly. does the OP want something like 10 am - 11 am and 3 pm - 7 pm? now that is another question altogether from the original post. :) Commented Apr 13, 2019 at 6:17
  • @FailSafe this sentence transformation is NOT meant for human consumption so yes I really do want to do this. Commented Apr 13, 2019 at 6:19

4 Answers 4

1

Use capture groups for both the digits and the "am" or "pm" string and then just substitute with a space between the groups.

import re

s = '10am 11pm 13am 14pm 4am'

subbed = re.sub(r'(\d+)([ap]m)', r'\1 \2', s)
print(subbed)
# 10 am 11 pm 13 am 14 pm 4 am
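
As a possible refinement (not part of the original answer), a trailing word boundary `\b` should keep the pattern from firing when the digits are followed by a longer word that merely starts with "am" or "pm". This is only a sketch: the helper name `space_meridiem` and the "5amp" test string are made up for illustration.

```python
import re

# Hypothetical helper; the pattern is the answer's regex plus a trailing \b.
_TIME_RE = re.compile(r'(\d+)([ap]m)\b')


def space_meridiem(text):
    """Insert a space between a number and an am/pm marker that ends a word."""
    return _TIME_RE.sub(r'\1 \2', text)


print(space_meridiem('10am 11pm 13am 14pm 4am'))
print(space_meridiem('the amphitheater opens at 10am-11am and 3pm-7pm'))
print(space_meridiem('a 5amp fuse'))  # "amp" is not a time marker, so no change
```

Whether the boundary is wanted depends on the input; for strings that only ever contain times, the plain pattern above behaves the same.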

0

This will do the work:

import re
myfile =  '10am 11pm 13am 14pm 4am'
re.sub(r'(\d+)(am|pm)', r'\1 \2', myfile)

Here is the test output:

>>> import re
>>> myfile =  '10am 11pm 13am 14pm 4am'
>>> re.sub(r'(\d+)(am|pm)', r'\1 \2', myfile)
'10 am 11 pm 13 am 14 pm 4 am'
>>> 

EDIT: Here is the output of the same solution dealing with the string you posted in the comments:

>>> import re
>>> myfile = 'The amphitheater opens at 10am-11am and 3pm-7pm'
>>> re.sub(r'(\d+)(am|pm)', r'\1 \2', myfile)
'The amphitheater opens at 10 am-11 am and 3 pm-7 pm'
>>> 

2 Comments

I think I was not clear about why I was using a regex in this case. Imagine the sentence: "the amphitheater opens at 10am-11am and 3pm-7pm" - we want to make sure NOT to replace 'am' in amphitheater.
@jvence, did you check my answer? It addresses that without any problem, since I'm matching numbers immediately followed by am or pm, with no space in between.
0

If you really wanted a solution using regex instead of a plain string replace method as mentioned above, you could use the below snippet.

import re
myfile = '10am 11pm 13am 14pm 4am'
myfile2 = re.sub(r'(\d+)(am)', lambda m: '{} {}'.format(*m.groups()), myfile, count=0)
print(myfile2)

3 Comments

Why introduce lambda and str.format when you are already using re.sub?
@accdias That is needed since you need to know both the digits and the am/pm part. This solution is flexible enough to handle both am and pm. My initial snippet had (am|pm) as the second part of the regex; it was later edited to include only am, since that's what the OP asked for. Hope that answers your question.
I see what you are referring to here, instead of lambda, you can directly use back references like r'\1 \2'
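
To illustrate the point made in these comments, here is a small sketch comparing the two replacement styles. It assumes the (am|pm) form of the pattern mentioned in the discussion; the variable names are only for illustration.

```python
import re

text = '10am 11pm 13am 14pm 4am'
pattern = r'(\d+)(am|pm)'

# Callable replacement: the function receives each match object.
with_lambda = re.sub(pattern, lambda m: '{} {}'.format(*m.groups()), text)

# Template replacement: \1 and \2 refer to the two capture groups.
with_backrefs = re.sub(pattern, r'\1 \2', text)

print(with_lambda)
print(with_lambda == with_backrefs)
```

A callable is mainly useful when the replacement needs logic that a template cannot express; for simply joining the groups with a space, the backreference form is shorter.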
0

You could do this without using re:

'10am 11pm 13am 14pm 4am'.replace('a',' a').replace('p',' p')  

## Output: '10 am 11 pm 13 am 14 pm 4 am'

4 Comments

Thank you for not using a solution that's more complicated than needed. Hate to say that I wonder if this question will get -1'ed? Anyway, I'm gonna post this under yours if he wants regex because a full answer isn't needed at all here. >>> re.sub(r'(a|p)', r' \1', '10am 11pm 13am 14pm 4am') ................ #OUTPUT: '10 am 11 pm 13 am 14 pm 4 am'
@FailSafe Thanks and your regex pattern is the most concise and apt among the others on this page. Hopefully the OP takes notice of it.
@FailSafe See comment added above about the sentence: "the amphitheater opens at 10am-11am and 3pm-7pm"
This will have collateral effects with strings out of that pattern.
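
The collateral effect mentioned in the comments can be seen with a short sketch, using the sentence from the question's comments and the lookbehind pattern suggested there. The variable names are made up for illustration.

```python
import re

sentence = 'the amphitheater opens at 10am-11am and 3pm-7pm'

# Plain str.replace touches every 'a' and 'p', including those inside words.
naive = sentence.replace('a', ' a').replace('p', ' p')

# The lookbehind only matches am/pm that directly follows a digit.
targeted = re.sub(r'(?<=\d)([ap]m)', r' \1', sentence)

print(naive)
print(targeted)
```

The first result breaks up "amphitheater" and other words, while the second leaves everything except the times untouched.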
