1

I am a complete beginner at programming and am reading the book "Automate the Boring Stuff with Python". In Chapter 7, there is a practice project: a regex version of strip(). My code below does not work (I use Python 3.6.1). Could anyone help?

import re

string = input("Enter a string to strip: ")
strip_chars = input("Enter the characters you want to be stripped: ")

def strip_fn(string, strip_chars):
    if strip_chars == '':
        blank_start_end_regex = re.compile(r'^(\s)+|(\s)+$')
        stripped_string = blank_start_end_regex.sub('', string)
        print(stripped_string)
    else:
        strip_chars_start_end_regex = re.compile(r'^(strip_chars)*|(strip_chars)*$')
        stripped_string = strip_chars_start_end_regex.sub('', string)
        print(stripped_string)
2
  • 1
    Does not work how, and for what input? Commented Sep 21, 2017 at 13:02
  • 1
    r'^(strip_chars)*|(strip_chars)*$' -> r'^[{0}]+|[{0}]+$'.format("".join([re.escape(x) for x in strip_chars])). Also, remove the unnecessary ( and ) in r'^(\s)+|(\s)+$'. Commented Sep 21, 2017 at 13:02

3 Answers

2

You can also use re.sub to remove the characters at the start or end of the string. Say the character is 'x':

re.sub(r'^x+', "", string)
re.sub(r'x+$', "", string)

The first line acts as lstrip and the second as rstrip. This just looks simpler.
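
The two substitutions above could be combined for an arbitrary character roughly as follows. This is only a sketch: the helper name strip_char and the call to re.escape are assumptions added for illustration, and the helper assumes a single character is passed.

```python
import re

def strip_char(text, char="x"):
    """Remove runs of one character from both ends of text (hypothetical helper)."""
    escaped = re.escape(char)                        # so '.', '*', etc. are taken literally
    text = re.sub(r'^' + escaped + r'+', "", text)   # leading run, like lstrip
    text = re.sub(escaped + r'+$', "", text)         # trailing run, like rstrip
    return text

print(strip_char("xxhexlloxx"))      # hexllo
print(strip_char("..a.b..", "."))    # a.b
```

Note that the 'x' in the middle of the string is left alone, because both patterns are anchored.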


0

When you use the string literal r'^(strip_chars)*|(strip_chars)*$', strip_chars is not interpolated, i.e. it is treated as part of the string. You need to pass it to the regex as a variable. However, passing it in its current form would still produce a wrong regex, because (...) in a regex is a grouping construct that matches the whole sequence, while you want to match a single char from the set of chars stored in the strip_chars variable.
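
A small demonstration of both points may help. The value "ab" and the sample strings are made up for illustration.

```python
import re

strip_chars = "ab"   # example value, chosen for illustration

# The name inside the raw string is just text: this pattern looks for the
# literal word "strip_chars", not for the characters 'a' and 'b'.
literal = re.compile(r'^(strip_chars)*|(strip_chars)*$')
unchanged = literal.sub('', 'abhelloba')
word_removed = literal.sub('', 'strip_charshello')
print(unchanged)       # abhelloba
print(word_removed)    # hello

# Interpolating the value changes the pattern, but (ab)* matches the whole
# sequence "ab", not any single character from the set, so "ba" survives.
grouped = re.compile(r'^({0})*|({0})*$'.format(strip_chars))
partly_stripped = grouped.sub('', 'abhelloba')
print(partly_stripped)  # helloba
```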

You could just wrap the string in a pair of [ and ] to create a character class, but if the variable contains, say, z-a, the resulting pattern would be invalid. To play it safe, you also need to escape each char.
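
To illustrate the z-a case, here is a short sketch. The helper name build_class is hypothetical, and the exact error message depends on the Python version.

```python
import re

def build_class(chars):
    """Build a character class with every char escaped (hypothetical helper)."""
    return '[' + ''.join(re.escape(ch) for ch in chars) + ']'

strip_chars = "z-a"   # example value from the paragraph above

# Unescaped, "z-a" is read as a range that runs backwards, which is an error.
try:
    re.compile('[' + strip_chars + ']')
    unescaped_ok = True
except re.error as exc:
    unescaped_ok = False
    print("invalid pattern:", exc)

# Escaped, the hyphen is a literal character and the class compiles fine.
safe_class = build_class(strip_chars)
print(safe_class)                                     # [z\-a]
print(re.sub('^' + safe_class + '+', '', 'a-zebra'))  # ebra
```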

Replace

r'^(strip_chars)*|(strip_chars)*$'

with

r'^[{0}]+|[{0}]+$'.format("".join([re.escape(x) for x in strip_chars]))

I advise replacing the * (zero or more occurrences) quantifier with + (one or more occurrences) because in most cases, when we want to remove something, we need to match at least 1 occurrence of the unnecessary string(s).

Also, you may replace r'^(\s)+|(\s)+$' with r'^\s+|\s+$', since the repeated capturing groups keep overwriting their group values on each iteration, slightly slowing down the regex.
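
Putting these suggestions together, the whole function might look like the sketch below. The name regex_strip and the choice to return the result instead of printing it are assumptions, not something the question requires.

```python
import re

def regex_strip(text, strip_chars=''):
    """Sketch of a regex-based strip(); an empty strip_chars strips whitespace."""
    if strip_chars == '':
        pattern = r'^\s+|\s+$'
    else:
        # Escape every char, then wrap the result in a character class.
        char_class = '[{0}]'.format(''.join(re.escape(ch) for ch in strip_chars))
        pattern = r'^{0}+|{0}+$'.format(char_class)
    return re.sub(pattern, '', text)

print(repr(regex_strip('   hello world  ')))          # 'hello world'
print(regex_strip('xyxhello, xyzzy worldyx', 'xy'))   # hello, xyzzy world
```

One caveat: $ also matches just before a trailing newline, so for inputs ending in '\n' the result can differ from the built-in str.strip.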


0
#! python
# Regex version of strip()
import re

def RegexStrip(mainString, charsToBeRemoved=None):
    if charsToBeRemoved:
        # Escape each char so that ], ^, - and \ are literal inside the class
        charClass = '[%s]' % ''.join(re.escape(c) for c in charsToBeRemoved)
        # Anchor to both ends so that only leading/trailing chars are removed
        regex = re.compile(r'^%s+|%s+$' % (charClass, charClass))
    else:
        # None (or an empty string) falls back to stripping whitespace
        regex = re.compile(r'^\s+|\s+$')
    return regex.sub('', mainString)

Str='   hello3123my43name is antony    '
print(RegexStrip(Str))

Maybe this could help; it can be further simplified, of course.
