Python - Using regex to write a function imitating the strip() method

Question

I am working on a problem from Automate the Boring Stuff that asks me to imitate the strip() method using regex. I have mostly figured it out: it works with whitespace and with a specific word I want removed. But when I remove a specific keyword from the end of a string, it always cuts off the last letter of what remains. Can anyone help me figure out why?

import re

def strip_func(string, *args):
    strip_regex = re.compile(r'^(\s+)(.*?)(\s+)$')
    mo = strip_regex.findall(string)
    if not mo:
        rem = args[0]
        remove_regex = re.compile(rf'({rem})+(.*)[^{rem}]')
        remove_mo = remove_regex.findall(string)
        print(remove_mo[0][1])
    else:
        print(mo[0][1])

If no second argument is passed, the function removes whitespace from both sides of the string. I used this string to test that:

s = '        This is a string with whitespace on either side        '

Otherwise it removes the keyword, much like the strip() method. For example:

spam = 'SpamSpamBaconSpamEggsSpamSpam'
strip_func(spam, 'Spam')

Output:

BaconSpamEgg

The 's' at the end of Eggs is missing, and the same thing happens with every string I try. Thanks in advance for the help.

rf'({rem})+(.*)[^{rem}]' is just wrong, you cannot negate a sequence of chars with a negated character class. Use rf'({rem})+(.*?)(?={rem}|$)'
– Wiktor Stribiżew, Commented May 7, 2020 at 12:21
I suspect all you need is def strip_func(string, *args): return re.sub(rf'^(?:{re.escape(args[0])})+(.*?)(?:{re.escape(args[0])})+$', r'\1', string, flags=re.S) . See ideone.com/6jHH68
– Wiktor Stribiżew, Commented May 7, 2020 at 12:27
Am I right that all you need is to remove consecutive multicharacter sequences at the start and end of the string?
– Wiktor Stribiżew, Commented May 7, 2020 at 12:52
Thanks! I thought the negated character class sequence was weird, but it was the closest I got, so I just went with it. Yes, I want to remove the consecutive sequences from the start and end. Your update works well, so thanks again.
– Arcadia_Lake, Commented May 7, 2020 at 13:47
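
The point about the negated character class can be checked in isolation. `[^Spam]` does not mean "anything that is not the word Spam"; it matches exactly one character that is not `S`, `p`, `a`, or `m`. A minimal sketch using the string from the question, with the keyword written out literally instead of interpolated:

```python
import re

spam = 'SpamSpamBaconSpamEggsSpamSpam'

# [^Spam] consumes exactly one character that is not 'S', 'p', 'a' or 'm'.
original = re.compile(r'(Spam)+(.*)[^Spam]')
match = original.search(spam)

# The greedy (.*) backtracks until [^Spam] can match one character.
# The trailing 'SpamSpam' consists only of excluded letters, so the
# first character that qualifies is the lowercase 's' of 'Eggs'.
print(match.group(2))      # BaconSpamEgg
print(match.group(0)[-1])  # s
```

The `s` is consumed by the character class rather than captured by the group, which is why it disappears from the output.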

Wiktor Stribiżew · Accepted Answer · 2020-05-07 13:48:27Z

You may use

import re

def strip_func(string, *args):
  return re.sub(rf'^(?:{re.escape(args[0])})+(.*?)(?:{re.escape(args[0])})+$', r'\1', string, flags=re.S)

spam = 'SpamSpamBaconSpamEggsSpamSpam'
print(strip_func(spam, 'Spam'))

See the Python demo. The ^(?:{re.escape(args[0])})+(.*?)(?:{re.escape(args[0])})+$ pattern will create a pattern like ^(?:Spam)+(.*?)(?:Spam)+$ and will match

^ - start of string
(?:Spam)+ - one or more occurrences of Spam at the start of the string
(.*?) - Group 1: any 0 or more chars as few as possible
(?:Spam)+ - one or more occurrences of Spam at the end of the string
$ - end of string.
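
The role of `re.escape` is easiest to see by printing the generated pattern. In this sketch, `build_pattern` is a hypothetical helper that is not part of the code above, and the `'+'` keyword and `'++a+b++'` input are invented test values:

```python
import re

def build_pattern(keyword):
    # Hypothetical helper: the same pattern as above, built from an escaped keyword.
    escaped = re.escape(keyword)
    return rf'^(?:{escaped})+(.*?)(?:{escaped})+$'

print(build_pattern('Spam'))  # ^(?:Spam)+(.*?)(?:Spam)+$

# A keyword that is a regex metacharacter is still matched literally.
plus_pattern = build_pattern('+')
print(re.sub(plus_pattern, r'\1', '++a+b++', flags=re.S))  # a+b
```

Without the escaping step, a keyword such as `+` would be read as regex syntax rather than as literal text.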

The flags=re.S will make . match line break chars, too.
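
Note that the pattern above requires the keyword at both ends, since both groups use `+`; a string with the keyword on only one side is returned unchanged. As a possible variation (an assumption about the desired behavior, not something the question confirms), the sketch below uses `*` so that either end may lack the keyword, and falls back to trimming whitespace when no keyword is given, as the question describes. The name `strip_like` and the optional `chars` parameter are illustrative:

```python
import re

def strip_like(string, chars=None):
    # With no keyword, trim whitespace from both ends, like str.strip().
    unit = r'\s' if chars is None else re.escape(chars)
    # \A and \Z anchor at the absolute start and end of the string;
    # * allows zero repetitions, so either side may lack the keyword.
    pattern = rf'\A(?:{unit})*(.*?)(?:{unit})*\Z'
    return re.sub(pattern, r'\1', string, count=1, flags=re.S)

s = '        This is a string with whitespace on either side        '
print(strip_like(s))                                        # This is a string with whitespace on either side
print(strip_like('SpamSpamBaconSpamEggsSpamSpam', 'Spam'))  # BaconSpamEggs
print(strip_like('BaconSpam', 'Spam'))                      # Bacon
```

Unlike `str.strip(chars)`, which treats its argument as a set of individual characters, this removes whole repetitions of the keyword.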
