
I'm trying to strip the color escape codes from strings like the ones below (each input line is followed by the output I want):

[0m[ERROR] [1585551547.349979]: Failed to create bragfiles/downtimer/fight100/2020-03-27. Error: 550 Create directory operation failed.
[ERROR] [1585551547.349979]: Failed to create bragfiles/downtimer/fight100/2020-03-27. Error: 550 Create directory operation failed.

and

[32m[INFO] [2020-03-29 23:58:50.607198] TaskManager.poll: system has no current task.[0m
[INFO] [2020-03-29 23:58:50.607198] TaskManager.poll: system has no current task.

Plus the occasional doubled code:

"[0m[32m[INFO] [2020-03-29 23:58:34.695268] Polling for updates from the server for fight100...[0m"
"[INFO] [2020-03-29 23:58:34.695268] Polling for updates from the server for fight100..."

I've come across these questions before, but their answers don't seem to work in my case:

  1. How can I remove the ANSI escape sequences from a string in python
  2. Remove all ANSI colors/styles from strings

I've been trying variations of \x1B(?:[@-Z\\-_]|\[[0-?]*[ -/]*[@-~]), but none of the regexes I've tried so far seem to be generic enough.

  • Do you want to remove [0m and [32m? text = text.replace('[0m','').replace('[32m','')? Commented Apr 15, 2020 at 15:55
  • More that I want to do it for all color codes in the beginning of lines. Black 0;30 Dark Gray 1;30 Red 0;31 Light Red 1;31 Green 0;32 Light Green 1;32 Brown/Orange 0;33 Yellow 1;33 Blue 0;34 Light Blue 1;34 Purple 0;35 Light Purple 1;35 Cyan 0;36 Light Cyan 1;36 Light Gray 0;37 White 1;37 Commented Apr 15, 2020 at 16:08
  • Is the color escape sequence always followed by a '[TEXT]' sequence? Can the string be reliably split so the color escape sequence will be at the start of the resultant strings? Is the color sequence either at the start of the string OR preceded by a period? Have you considered making multiple passes? Commented Apr 15, 2020 at 17:33

1 Answer


This pattern matches one or two color escape sequences that are immediately followed by uppercase letters enclosed in square brackets. The bracketed part is checked with a positive lookahead, so it is not part of the match:

pat = r'''((\[\d+m){1,2})(?=\[[A-Z]+\])'''

Works with this string:

s = '''[0m[ERROR] [1585551547.349979]: xyz xyz.
[0m[32m[INFO] [2020-03-29 23:58:34.695268] hjk hjk.[0m[32m[INFO] [2020-03-29 23:58:34.695268] foo bar foo'''

The positive lookahead keeps the bracketed tag ([ERROR], [INFO]) out of the match, so only the color sequences are removed.


>>> print(re.sub(pat,'',s))
[ERROR] [1585551547.349979]: xyz xyz.
[INFO] [2020-03-29 23:58:34.695268] hjk hjk.[INFO] [2020-03-29 23:58:34.695268] foo bar foo
>>>
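For convenience, the session above can be collected into one self-contained script. This is only a sketch: the function name strip_color_prefixes is made up for illustration, and the pattern is the same one shown above with the capturing groups made non-capturing.

```python
import re

# One or two color codes such as "[0m" or "[32m", matched only when they are
# directly followed by a bracketed uppercase tag such as "[INFO]".
# The tag is checked by the lookahead and is not consumed.
PAT = re.compile(r'(?:\[\d+m){1,2}(?=\[[A-Z]+\])')


def strip_color_prefixes(text):
    """Remove color codes that directly precede a [TAG] (hypothetical helper)."""
    return PAT.sub('', text)


s = ('[0m[ERROR] [1585551547.349979]: xyz xyz.\n'
     '[0m[32m[INFO] [2020-03-29 23:58:34.695268] hjk hjk.'
     '[0m[32m[INFO] [2020-03-29 23:58:34.695268] foo bar foo')

print(strip_color_prefixes(s))
```

Bracketed timestamps such as [1585551547.349979] are left alone because the digits are not followed by an m.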

If you need to remove sequences specifying foreground and background colors like

[2m[93m[0m[32m[INFO] [2020-03-29 23:58:34.695268] foo bar foo

use pat = r'''((\[\d+m){1,})(?=\[[A-Z]+\])''', which matches one or more sequences instead of one or two ({1,} is equivalent to +).


If there are also codes with a semicolon-separated attribute, like these:

[0m[1;37m[ERROR] [1585551547.349979]: xyz xyz.
[0m[1;37m[0;32m[ERROR] [1585551547.349979]: xyz xyz.

use pat = r'''(\[([01];)?\d+m){1,}(?=\[[A-Z]+\])'''
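A quick check of that last pattern against the sample lines above might look like the sketch below. PAT_BROAD is not from the answer: it is a hypothetical variant that assumes codes may carry any number of semicolon-separated numeric parameters, and the [38;5;208m input is an illustrative string that does not appear in the question.

```python
import re

# The pattern above (groups made non-capturing, {1,} written as +).
PAT_ANSWER = re.compile(r'(?:\[(?:[01];)?\d+m)+(?=\[[A-Z]+\])')

# Hypothetical broader variant: any number of ";"-separated numeric parameters.
PAT_BROAD = re.compile(r'(?:\[\d+(?:;\d+)*m)+(?=\[[A-Z]+\])')

samples = [
    '[0m[1;37m[ERROR] [1585551547.349979]: xyz xyz.',
    '[0m[1;37m[0;32m[ERROR] [1585551547.349979]: xyz xyz.',
    '[2m[93m[0m[32m[INFO] [2020-03-29 23:58:34.695268] foo bar foo',
]

cleaned_answer = [PAT_ANSWER.sub('', line) for line in samples]
cleaned_broad = [PAT_BROAD.sub('', line) for line in samples]

for line in cleaned_answer:
    print(line)

# Illustrative input with three parameters: only the broader variant strips it.
extra = '[38;5;208m[WARN] disk almost full'
print(PAT_ANSWER.sub('', extra))
print(PAT_BROAD.sub('', extra))
```

Both patterns give the same result on the three sample lines. They differ only when a code has a prefix other than 0; or 1;, or more than two parameters.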


Some of your example strings show color sequences in the middle of the string, and your desired output shows them removed, which contradicts your comment:

all color codes in the beginning of lines.

These patterns also remove a sequence from the middle of a string, as long as it is followed by a bracketed uppercase tag.
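One case the lookahead-based patterns leave untouched is a trailing reset code, such as the [0m at the end of the TaskManager.poll line in the question, because no bracketed tag follows it. The sketch below shows that behavior and one possible alternative without the lookahead. It rests on two assumptions that are not confirmed by the question: that the real log data carries the ESC character (\x1b) in front of each code, and that the message text never contains something like [5m that should be kept. The names ANSI_COLOR and strip_colors are made up for illustration.

```python
import re

# Lookahead-based pattern from the answer (groups made non-capturing).
PAT_TAGGED = re.compile(r'(?:\[(?:[01];)?\d+m)+(?=\[[A-Z]+\])')

# Assumption: each code may be preceded by the ESC character. It is optional
# here so that the ESC-less text shown in the question matches as well.
ANSI_COLOR = re.compile(r'\x1b?\[\d+(?:;\d+)*m')


def strip_colors(text):
    """Remove every color code, wherever it appears (hypothetical helper)."""
    return ANSI_COLOR.sub('', text)


line = ('[32m[INFO] [2020-03-29 23:58:50.607198] '
        'TaskManager.poll: system has no current task.[0m')

tagged_only = PAT_TAGGED.sub('', line)   # trailing "[0m" is still there
fully_clean = strip_colors(line)         # trailing "[0m" is gone

print(tagged_only)
print(fully_clean)

# Same idea with real ESC characters in the input.
raw = '\x1b[0m\x1b[32m[INFO] polling...\x1b[0m'
print(strip_colors(raw))
```

The trade-off is that dropping the lookahead removes anything that looks like a color code, so the lookahead version is the safer choice if the log text itself can contain such strings.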
