How to remove spaces before pattern and after it in python with regex?

Question

Problem: I have a list of strings, and I need to remove all whitespace before and after any substring that looks like 'digit / digit', while keeping the spaces inside that substring. I've been stuck on this for quite a while and still don't understand how to fix it. I would appreciate any help.

Sample input:

steps = [
'mix butter , flour , 1 / 3 c',
'sugar and 1-1 / 4 t',
'vanilla'
]

Expected output:

[
'mixbutter,flour,1 / 3c',
'sugarand1-1 / 4t',
'vanilla'
]

My approach:

steps_new = []
for step in steps:
    step = re.sub(r'\s+[^\d+\s/\s\d+]','',step)
    steps_new.append(step)
steps_new

My output:

[
'mixutterlour 1 / 3',
'sugarnd 1-1 / 4',
'vanilla'
]

What is your desired output?

– Tim Roberts, Commented Oct 19, 2022 at 19:58

Wiktor Stribiżew · Accepted Answer · 2022-10-19 19:59:09Z

1

You can use

import re
steps = ['mix butter , flour , 1 / 3 c', 'sugar and 1-1 / 4 t', 'vanilla']
# Keep each "digits / digits" match (Group 1) unchanged; replace any other whitespace run with "".
steps_new = [re.sub(r'(\d+\s*/\s*\d+)|\s+', lambda m: m.group(1) or "", step) for step in steps]
print(steps_new) # => ['mixbutter,flour,1 / 3c', 'sugarand1-1 / 4t', 'vanilla']

The (\d+\s*/\s*\d+)|\s+ regex matches and captures into Group 1 sequences of digits + zero or more whitespaces + / + zero or more whitespaces + digits (with (\d+\s*/\s*\d+)), or (|) just matches one or more whitespaces (\s+).

If Group 1 participated in the match, the replacement is the Group 1 value, i.e. no replacement occurs. Otherwise, only whitespace was matched and the replacement is an empty string.
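To see the callback at work, here is a minimal sketch that records what each match is replaced with. The helper name `strip_spaces_keep_fractions`, the inner `keep_fraction` callback, and the `trace` list are illustrative assumptions, not part of the answer above.

```python
import re

PATTERN = re.compile(r'(\d+\s*/\s*\d+)|\s+')

def strip_spaces_keep_fractions(text, trace=None):
    """Remove whitespace except inside 'digits / digits' (hypothetical helper)."""
    def keep_fraction(m):
        # Group 1 is None when only the \s+ alternative matched.
        replacement = m.group(1) or ""
        if trace is not None:
            trace.append((m.group(0), replacement))
        return replacement
    return PATTERN.sub(keep_fraction, text)

trace = []
result = strip_spaces_keep_fractions("sugar and 1-1 / 4 t", trace)
print(result)  # sugarand1-1 / 4t
print(trace)   # [(' ', ''), (' ', ''), ('1 / 4', '1 / 4'), (' ', '')]
```

Because Group 1 is returned verbatim, whatever spacing the input has around the slash is preserved: `"a  1  /  3  b"` becomes `"a1  /  3b"`.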


Andrej Kesely · Accepted Answer · 2022-10-19 20:16:00Z

1

You can remove all spaces and then re-insert a space on each side of the slash wherever (\d)/(\d) matches:

import re

steps = ["mix butter , flour , 1 / 3 c", "sugar and 1-1 / 4 t", "vanilla"]

for step in steps:
    x = re.sub(r"(\d)/(\d)", r"\1 / \2", step.replace(" ", ""))
    print(x)

Prints:

mixbutter,flour,1 / 3c
sugarand1-1 / 4t
vanilla

edited Oct 19, 2022 at 20:16

answered Oct 19, 2022 at 20:02


1 Comment

Wiktor Stribiżew Over a year ago

Just FYI: here, you assume there is always a single space on each side of a slash that is enclosed by digits.
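One way to see the effect of that assumption is to run a variant on inputs with irregular spacing. The sketch below strips every whitespace character (`\s+`, not only `" "`) and then writes exactly one space on each side of a digit-enclosed slash. The name `normalize_step` is hypothetical, and normalizing the spacing rather than preserving the original is an assumption about what is wanted.

```python
import re

def normalize_step(step):
    """Drop all whitespace, then put one space on each side of digit/digit."""
    compact = re.sub(r"\s+", "", step)
    # Lookarounds do not consume the digits, so chained cases like 1/2/3 are all handled.
    return re.sub(r"(?<=\d)/(?=\d)", " / ", compact)

for step in ["1 / 3 c", "1  /  3 c", "1/3 c", "and / or"]:
    print(repr(normalize_step(step)))
# '1 / 3c'
# '1 / 3c'
# '1 / 3c'
# 'and/or'
```

All three spellings of the fraction come out identical, which is the behavior the comment points at: the original spacing around the slash is not kept.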

anubhava · Accepted Answer · 2022-10-19 20:47:47Z

1

You may use this lookaround-based solution to do this with a single regex:

(?<!/)[ \t](?![ \t]*/)

RegEx Details:

(?<!/): Assert that the previous character is not /
[ \t]: Match a single space or tab
(?![ \t]*/): Assert that the next position is not followed by a / after 0 or more spaces or tabs

Code:

import re

arr = ["mix butter , flour , 1 / 3 c", "sugar and 1-1 / 4 t", "vanilla"]

rx = re.compile(r'(?<!/)[ \t](?![ \t]*/)')

for i in arr:
    print(rx.sub('', i))
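As a quick check of the lookarounds described above, the sketch below runs the same pattern on a few extra inputs. The inputs other than the first are illustrative and not from the question, and `strip_spaces` is a hypothetical wrapper name. Note that the pattern keys on the slash alone rather than on digits, and that the lookbehind only protects the one space directly after the slash.

```python
import re

rx = re.compile(r'(?<!/)[ \t](?![ \t]*/)')

def strip_spaces(text):
    """Remove spaces/tabs except those leading up to, or directly after, a slash."""
    return rx.sub('', text)

print(strip_spaces("sugar and 1-1 / 4 t"))   # sugarand1-1 / 4t
print(strip_spaces("salt and / or pepper"))  # saltand / orpepper
print(strip_spaces("1  /  3 c"))             # 1  / 3c
```

In the last case both spaces before the slash survive (the lookahead skips any run of spaces), but only the first space after it does, because the second one is preceded by a space rather than by `/`.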
