
I have a list that I'm trying to reformat.

data = ['Height:\n      \n      6\' 4"', 'Weight:\n      \n      185 lbs.', 'Reach:\n      \n      80"', 'STANCE:\n      \n      Switch', 'DOB:\n      \n      \n        Jul 22, 1989', 'SLpM:\n          \n\n          3.93', 'Str. Acc.:\n          \n          49%', 'SApM:\n          \n          2.67', 'Str. Def:\n          \n          59%', '', 'TD Avg.:\n          \n          0.00', 'TD Acc.:\n          \n          0%', 'TD Def.:\n          \n          78%', 'Sub. Avg.:\n          \n          0.2']

I've tried using strip.

for info in data:
        info.strip('\n      \n      ')

But, I'm still getting the same output.

How can I remove the "\n \n " whitespace inside each element of the list, so that I get the following?

data = ['Height: 6\' 4"', 'Weight: 185 lbs.', 'Reach: 80"', 'STANCE: Switch', 'DOB: Jul 22, 1989', 'SLpM: 3.93', 'Str. Acc.: 49%', 'SApM: 2.67', 'Str. Def: 59%', '', 'TD Avg.: 0.00', 'TD Acc.: 0%', 'TD Def.: 78%', 'Sub. Avg.: 0.2']
  • strip() returns a new string. It does not affect the original. Commented Jul 25, 2022 at 16:54
  • Plus, strip() won't remove the whitespace inside the string. Only at the ends. Commented Jul 25, 2022 at 16:56
  • Maybe build a new list and use regex: data = [re.sub(r"\s+", ' ', info) for info in data] Commented Jul 25, 2022 at 16:58
  • same idea @JohnnyMopp ! :) Commented Jul 25, 2022 at 16:59
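
To illustrate the two points about strip() made in the comments, here is a small sketch. The variable names are made up for the example; the first string is taken from the question's list.

```python
# str.strip() returns a new string; it never modifies the original in place.
info = 'Height:\n      \n      6\' 4"'
result = info.strip('\n ')

# The whitespace sits in the middle of the string, not at either end,
# so strip() finds nothing to remove and the result equals the input.
unchanged = (result == info)

# strip() only trims leading and trailing characters.
padded = '  \n Reach: 80"  \n'
trimmed = padded.strip()

print(unchanged)
print(repr(trimmed))
```

This is why the loop in the question appears to do nothing: the return value is discarded, and even if it were kept, the interior whitespace would still be there.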

3 Answers


Try this: replace every run of whitespace (spaces, tabs, newlines) with a single space using re.sub:

import re

def remove_multiple_ws(s: str) -> str:
    return re.sub(r"\s+", " ", str(s))


data = [remove_multiple_ws(s) for s in data]
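
If you would rather avoid regex, a possible alternative is the split/join idiom. This is only a sketch; collapse_ws and sample are hypothetical names, and the sample holds a few strings from the question.

```python
def collapse_ws(s: str) -> str:
    # str.split() with no arguments splits on runs of whitespace and
    # ignores leading/trailing whitespace; joining the pieces with " "
    # therefore collapses every run into a single space.
    return " ".join(s.split())


sample = [
    'Height:\n      \n      6\' 4"',
    '',
    'DOB:\n      \n      \n        Jul 22, 1989',
]
cleaned = [collapse_ws(s) for s in sample]
print(cleaned)
```

One difference from the re.sub version: split/join also drops leading and trailing whitespace, whereas re.sub(r"\s+", " ", s) leaves a single space at either end if the string started or ended with whitespace.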

1 Comment

An answer that contains an explanation is much better than just "try this"

Here is my approach: replace each colon and any whitespace that follows it with a colon and a single space:

import re

pattern = re.compile(r":\s*")
new_data = [
    pattern.sub(": ", datum)
    for datum in data
]

new_data then becomes:

['Height: 6\' 4"',
 'Weight: 185 lbs.',
 'Reach: 80"',
 'STANCE: Switch',
 'DOB: Jul 22, 1989',
 'SLpM: 3.93',
 'Str. Acc.: 49%',
 'SApM: 2.67',
 'Str. Def: 59%',
 '',
 'TD Avg.: 0.00',
 'TD Acc.: 0%',
 'TD Def.: 78%',
 'Sub. Avg.: 0.2']
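
One thing to keep in mind with this pattern: it rewrites every colon, not only the one after the label. The data in the question has no colons in the values, so this does not matter there. The sketch below uses a made-up entry (datum is hypothetical, not from the question) to show the effect and one possible way around it using the count argument.

```python
import re

pattern = re.compile(r":\s*")

# Hypothetical entry whose value also contains a colon.
datum = 'Time:\n      \n      12:30'

# Every colon is rewritten, including the one inside the value.
all_colons = pattern.sub(": ", datum)

# count=1 limits the substitution to the first (leftmost) match.
first_only = pattern.sub(": ", datum, count=1)

print(all_colons)
print(first_only)
```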


You can use re.sub to collapse any run of whitespace (spaces, tabs, newlines) into a single space.

From the documentation:

re.sub(pattern, repl, string, count=0, flags=0)

Return the string obtained by replacing the leftmost non-overlapping occurrences of pattern in string by the replacement repl. If the pattern isn’t found, string is returned unchanged.
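
A quick sketch of the behaviour described in that excerpt, using throwaway strings chosen for the example. It also shows why the empty string in the question's list comes through untouched.

```python
import re

# No whitespace in the input, so the pattern is not found and the
# string is returned unchanged.
no_match = re.sub(r"\s+", " ", "Reach:80")

# The empty string likewise has nothing to replace.
empty = re.sub(r"\s+", " ", "")

# count limits how many leftmost occurrences are replaced.
limited = re.sub(r"\s+", " ", "a  b  c", count=1)

print(repr(no_match), repr(empty), repr(limited))
```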

This is a way that re.sub could be used in this situation:

>>> import re
>>> mystring = ' string    string \t\n\n string'
>>> pattern = re.compile(r'\s+')
>>> pattern.sub(" ", mystring)
' string string string'

Using this method, an implementation for your code would look something like this:

import re

pattern = re.compile(r"\s+")
new_data = [pattern.sub(" ", part) for part in data]

Here is what new_data should be:

kali@kali:~$ python3 -i test.py
>>> new_data
['Height: 6\' 4"',
 'Weight: 185 lbs.',
 'Reach: 80"',
 'STANCE: Switch',
 'DOB: Jul 22, 1989',
 'SLpM: 3.93',
 'Str. Acc.: 49%',
 'SApM: 2.67',
 'Str. Def: 59%',
 '',
 'TD Avg.: 0.00',
 'TD Acc.: 0%',
 'TD Def.: 78%',
 'Sub. Avg.: 0.2']

If you want to learn more about regex in Python, here are some useful links:
