
Problem: I have a list of strings, and I need to remove all whitespace except the spaces inside a substring that looks like 'digit / digit' (the whitespace directly before and after that substring should go too). I have been stuck on this for quite a while and still don't understand how to fix it. I will appreciate any help.

Sample input:

steps = [
'mix butter , flour , 1 / 3 c',
'sugar and 1-1 / 4 t',
'vanilla'
]

Expected output:

[
'mixbutter,flour,1 / 3c',
'sugarand1-1 / 4t',
'vanilla'
]

My approach:

import re

steps_new = []
for step in steps:
    step = re.sub(r'\s+[^\d+\s/\s\d+]','',step)
    steps_new.append(step)
steps_new

My output:

[
'mixutterlour 1 / 3',
'sugarnd 1-1 / 4',
'vanilla'
]
  • What is your desired output? Commented Oct 19, 2022 at 19:58

3 Answers


You can use

import re
steps = ['mix butter , flour , 1 / 3 c', 'sugar and 1-1 / 4 t', 'vanilla']
steps_new = [re.sub(r'(\d+\s*/\s*\d+)|\s+', lambda m: m.group(1) or "", step) for step in steps]
print(steps_new) # => ['mixbutter,flour,1 / 3c', 'sugarand1-1 / 4t', 'vanilla']


The (\d+\s*/\s*\d+)|\s+ regex matches and captures into Group 1 sequences of digits + zero or more whitespaces + / + zero or more whitespaces + digits (with (\d+\s*/\s*\d+)), or (|) just matches one or more whitespaces (\s+).

If Group 1 participated in the match, the replacement is the Group 1 value, i.e. the 'digit / digit' substring is put back unchanged. Otherwise, only whitespace was matched and the replacement is an empty string.
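
The same idea can be sketched with a named callback instead of a lambda, which may make the "keep Group 1, drop everything else" logic easier to follow. The names FRACTION_OR_SPACE, keep_fraction and strip_spaces_except_fractions are hypothetical, chosen only for this illustration:

```python
import re

# Alternation: either a 'digit / digit' fraction (captured in Group 1)
# or a run of whitespace (not captured).
FRACTION_OR_SPACE = re.compile(r'(\d+\s*/\s*\d+)|\s+')

def keep_fraction(match):
    # Group 1 is None when only whitespace matched, so fall back to "".
    return match.group(1) or ""

def strip_spaces_except_fractions(text):
    return FRACTION_OR_SPACE.sub(keep_fraction, text)

steps = ['mix butter , flour , 1 / 3 c', 'sugar and 1-1 / 4 t', 'vanilla']
result = [strip_spaces_except_fractions(step) for step in steps]
print(result)  # ['mixbutter,flour,1 / 3c', 'sugarand1-1 / 4t', 'vanilla']
```

Because the fraction alternative comes first in the pattern, the whitespace inside a fraction is consumed as part of Group 1 and never reaches the \s+ branch.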


You can remove all spaces and then re-insert a space on each side of every slash that sits between two digits, (\d)/(\d):

import re

steps = ["mix butter , flour , 1 / 3 c", "sugar and 1-1 / 4 t", "vanilla"]

for step in steps:
    x = re.sub(r"(\d)/(\d)", r"\1 / \2", step.replace(" ", ""))
    print(x)

Prints:

mixbutter,flour,1 / 3c
sugarand1-1 / 4t
vanilla
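
One thing worth checking with this approach is what happens when the input spacing around the slash is not exactly one space. As a rough sketch (the function names and the sample strings are made up for illustration), removing and re-inserting spaces always normalizes the fraction to ' / ', whereas a capturing pattern such as (\d+\s*/\s*\d+)|\s+ leaves the original spacing as it was:

```python
import re

def normalize_fraction_spacing(step):
    # Drop every space, then put exactly one space on each side of digit/digit.
    return re.sub(r"(\d)/(\d)", r"\1 / \2", step.replace(" ", ""))

def preserve_fraction_spacing(step):
    # Keep the fraction exactly as written; drop all other whitespace.
    return re.sub(r"(\d+\s*/\s*\d+)|\s+", lambda m: m.group(1) or "", step)

wide = "add 1  /  3 c"   # two spaces on each side of the slash
tight = "add 1/3 c"      # no spaces around the slash

print(normalize_fraction_spacing(wide))   # add1 / 3c
print(preserve_fraction_spacing(wide))    # add1  /  3c
print(normalize_fraction_spacing(tight))  # add1 / 3c
print(preserve_fraction_spacing(tight))   # add1/3c
```

Which behavior is preferable depends on whether the output should be normalized or should mirror the input. Note also that str.replace(" ", "") removes only the space character, not tabs or other whitespace.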

1 Comment

Just FYI: here, you assume there is always a single space on each side of a slash enclosed by digits.

You may use this lookaround-based solution to do this with a single regex:

(?<!/)[ \t](?![ \t]*/)


RegEx Details:

  • (?<!/): Assert that previous character is not /
  • [ \t]: Match a space or tab
  • (?![ \t]*/): Assert that next position doesn't have a / after 0 or more spaces

Code:

import re

arr = ["mix butter , flour , 1 / 3 c", "sugar and 1-1 / 4 t", "vanilla"]

rx = re.compile(r'(?<!/)[ \t](?![ \t]*/)')

for i in arr:
    print(rx.sub('', i))
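
Since the lookarounds only test for a neighbouring /, and not for digits, this pattern keeps the spaces around any slash. The short sketch below illustrates that on one sample from the question and one made-up string without digits; the name SPACE_NOT_NEXT_TO_SLASH is hypothetical:

```python
import re

# Same pattern as above: a space or tab that is neither directly after a slash
# nor followed (after optional spaces or tabs) by a slash.
SPACE_NOT_NEXT_TO_SLASH = re.compile(r'(?<!/)[ \t](?![ \t]*/)')

samples = ["sugar and 1-1 / 4 t", "this and / or that"]
cleaned = [SPACE_NOT_NEXT_TO_SLASH.sub('', s) for s in samples]
print(cleaned)  # ['sugarand1-1 / 4t', 'thisand / orthat']
```

If the spaces should survive only around a slash between digits, the lookarounds would need to check for digits as well, or a capturing pattern could be used instead.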
