I have a Python script that loops over several regex search/replace operations. One of those operations is supposed to remove trailing spaces. I've tried:

re.sub(r"""\s+$""", '', str)

re.sub(r""" +$""", r"""""", str)

and

re.sub(r""" +$""", r"""""", str, re.M)

I found several answers that simply recommend using strip, but my problem is that I want to integrate this into the regex replace mechanism.

  • Why all the triple-quotes? r"\s+$" would work fine. Commented Jun 27, 2013 at 18:37
  • Why do you want to integrate this into the regex replace mechanism? There is no obvious reason to do so. Commented Jun 27, 2013 at 18:39
  • @rmunn It's just to emphasize that the normal expression does not work, yours included. SlaterTyranus: because I already have the regex mechanism in place, and I don't want to go through the file line by line in a separate loop just to do this. Commented Jun 27, 2013 at 18:45
  • @MysticOdin are you assigning the result of sub back to str? Otherwise str is never gonna change. Commented Jun 27, 2013 at 18:49
  • @m.buettner Yes. The script works for all regexes in the dictionary except this one. It's actually in the middle of the dictionary, and the entries before and after it succeed. Commented Jun 27, 2013 at 18:55

2 Answers

The function is re.sub. It takes the target string as an argument and returns a modified copy, so you need to assign the result:

str = re.sub(r'\s+$', '', str)

or if you want to remove trailing spaces from multiple lines in a single string, use one of these:

str = re.sub(r'\s+$', '', str, 0, re.M)
str = re.sub(r'\s+$', '', str, flags=re.M)

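The 0 is the count parameter (where 0 means no limit), and re.M makes $ match at the end of every line. If you don't pass flags as a keyword argument, you need that extra 0, because flags is the fifth parameter. In a call like re.sub(pattern, repl, str, re.M), re.M lands in the count slot and is never applied as a flag.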
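A minimal sketch of that pitfall, in case it helps to see it run. The sample text is made up for illustration, and count=re.M is used here only to mirror what happens when re.M is passed as the fourth positional argument:

```python
import re

# Hypothetical sample text: three lines, each with trailing spaces.
text = "a  \nb  \nc  "

# re.M in the count slot: it is read as a maximum number of replacements,
# and $ still only matches at the very end of the string.
as_count = re.sub(r" +$", "", text, count=re.M)

# re.M passed as flags: $ now matches at the end of every line.
as_flags = re.sub(r" +$", "", text, flags=re.M)

print(repr(as_count))  # 'a  \nb  \nc'
print(repr(as_flags))  # 'a\nb\nc'
```

Passing flags by keyword avoids the mix-up entirely, and newer Python releases also warn when count is passed positionally.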

Note that you only need triple quotes for multiline strings. What's important is the r for the pattern.

Alternatively, use rstrip to remove trailing whitespace from the end of the string:

str = str.rstrip()
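Keep in mind that rstrip on a multi-line string only touches the very end of it. If every line needs stripping, one possible approach is to strip line by line and join the pieces again. The sketch below uses made-up sample text and assumes "\n" line endings:

```python
# Hypothetical sample: trailing spaces, a trailing tab, and a blank line.
text = "a  \nb\t\n\nc  \n"

# rstrip() on the whole string only removes whitespace at its end.
whole = text.rstrip()

# Splitting on "\n" keeps the empty last piece, so the final newline
# survives the round trip through join().
per_line = "\n".join(line.rstrip() for line in text.split("\n"))

print(repr(whole))     # 'a  \nb\t\n\nc'
print(repr(per_line))  # 'a\nb\n\nc\n'
```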

7 Comments

+1 for rstrip, definitely the right choice here. Any idea how the performance of these two compares?
@MysticOdin You still need to supply the input string. Otherwise, how is re going to know which string to do the replacement on?
I understand that rstrip is the correct choice for simply removing trailing spaces. The problem is that this script runs on whole files, not line by line, and does several search/replace operations from an expression dictionary.
The line in the script is something like re.sub(Exp, RepExp, FileDump, re.M), but it's inside a loop and I didn't want to share unnecessary code.
This example removes multiple newlines.
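The asker's actual script is not shown, so the following is only a guess at how a table of expressions could carry its own flags and still run in one loop over a whole file dump. RULES, apply_rules and the sample rules are hypothetical names and values chosen for illustration:

```python
import re

# Hypothetical rule table: (pattern, replacement, flags).
RULES = [
    (r"\bcolour\b", "color", 0),
    (r"[ \t]+$", "", re.M),   # trailing spaces/tabs on every line
    (r"\n{3,}", "\n\n", 0),   # collapse runs of blank lines
]

def apply_rules(text, rules=RULES):
    """Apply each rule in order and return the rewritten text."""
    for pattern, replacement, flags in rules:
        # flags is passed by keyword so it cannot land in the count slot.
        text = re.sub(pattern, replacement, text, flags=flags)
    return text

file_dump = "colour  \n\n\n\nline two\t\n"
result = apply_rules(file_dump)
print(repr(result))  # 'color\n\nline two\n'
```

Storing the flags next to each pattern means only the rules that need re.M get it, and the loop body stays the same for every rule.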

This removes trailing spaces and tabs from every .py file under a directory using a regex:

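import os
import re

# Root directory to scan (placeholder path).
PATH = '/path/to/source'

# Spaces or tabs followed by a newline or the end of the string; the
# newline is captured so it can be put back.
re_strip = re.compile(r'[ \t]+(\n|\Z)')

for path, dirs, files in os.walk(PATH):
    for f in files:
        file_name, file_extension = os.path.splitext(f)
        # Only rewrite Python source files.
        if file_extension == '.py':
            path_name = os.path.join(path, f)
            with open(path_name, 'r') as fh:
                data = fh.read()

            # Drop the trailing blanks, keep the captured line ending.
            data = re_strip.sub(r'\1', data)

            with open(path_name, 'w') as fh:
                fh.write(data)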
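To try the pattern without touching any files, here is a small sketch on an in-memory string. The sample text is made up. It also compares against \s+$ with re.M, which behaves differently because \s matches newlines too:

```python
import re

# Same pattern as in the answer above.
re_strip = re.compile(r'[ \t]+(\n|\Z)')

# Hypothetical sample with trailing blanks and two empty lines.
sample = "a  \n\n\nb\t \n"

# Only spaces and tabs are removed; every newline is kept.
kept_blank_lines = re_strip.sub(r'\1', sample)

# \s+ also consumes newlines, so blank lines and the final newline go too.
collapsed = re.sub(r'\s+$', '', sample, flags=re.M)

print(repr(kept_blank_lines))  # 'a\n\n\nb\n'
print(repr(collapsed))         # 'a\nb'
```

If blank lines in the file matter, the [ \t] character class is the safer choice of the two.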
