Based on what you said you want, and on the fact that you wrote:
I have the string
'Aaa Bbb', 'AaaBbbCcc' ,'OneTwost.Three'
These should do it.
Input:
>>> import re
>>> string = """'Aaa Bbb', 'AaaBbbCcc' ,'OneTwost.Three'"""
Output:
>>> re.sub(r'((?<![\',\s])[A-Z]+|[\S]{2}\.)', r' \1', string)
"'Aaa Bbb', 'Aaa Bbb Ccc' ,'One Two st. Three'"
Edit
Input (acting on string and on a new variable, string_1, which drops the single quotes)
>>> import re
>>> string = """'Aaa Bbb', 'AaaBbbCcc' ,'OneTwost.Three'"""
>>> string_1 = """Aaa Bbb, AaaBbbCcc ,OneTwost.Three"""
Output
>>> re.sub(r'(?<!^)(?<!,)(?<!\s)(?<!\')([A-Z]+|[\S]{2}\.)', r' \1', string)
"'Aaa Bbb', 'Aaa Bbb Ccc' ,'One Two st. Three'"
>>> re.sub(r'(?:(?<!^)(?<!,)(?<!\s)(?<!\'))([A-Z]+|[\S]{2}\.)', r' \1', string)
"'Aaa Bbb', 'Aaa Bbb Ccc' ,'One Two st. Three'"
>>> re.sub(r'(?<!^)(?<!,)(?<!\s)(?<!\')([A-Z]+|[\S]{2}\.)', r' \1', string_1)
'Aaa Bbb, Aaa Bbb Ccc ,One Two st. Three'
>>> re.sub(r'(?:(?<!^)(?<!,)(?<!\s)(?<!\'))([A-Z]+|[\S]{2}\.)', r' \1', string_1)
'Aaa Bbb, Aaa Bbb Ccc ,One Two st. Three'
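The two patterns applied to each string above differ only in the (?:...) wrapper around the lookbehinds. Here is a minimal sketch to check that they agree on both inputs; the names plain, wrapped and results are mine, not part of the original session.

```python
import re

string = """'Aaa Bbb', 'AaaBbbCcc' ,'OneTwost.Three'"""
string_1 = """Aaa Bbb, AaaBbbCcc ,OneTwost.Three"""

# Lookbehinds written one after another (hypothetical name: plain).
plain = r"(?<!^)(?<!,)(?<!\s)(?<!\')([A-Z]+|[\S]{2}\.)"
# The same lookbehinds wrapped in a non-capturing group (hypothetical name: wrapped).
wrapped = r"(?:(?<!^)(?<!,)(?<!\s)(?<!\'))([A-Z]+|[\S]{2}\.)"

results = {}
for name, text in (("string", string), ("string_1", string_1)):
    a = re.sub(plain, r" \1", text)
    b = re.sub(wrapped, r" \1", text)
    results[name] = (a, b)
    # Prints the variable name and whether both patterns gave the same result.
    print(name, a == b)
```

Since the wrapper only groups zero-width assertions, either spelling should do; the shorter one is probably easier to read.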
Explanation of the First:
- I made the input a single string, as your quote suggested
- I'm using re.sub, which returns an edited copy of the string. The raw-string prefix (r) keeps the backslashes in the pattern and in the replacement literal, so that "\s", "\." and "\1" reach the re engine unchanged
- With the first "(" I'm setting it up to capture everything in the subsequent query
- With "(?<![\',\s])" I'm saying make sure that what follows, which I am trying to capture, is not preceded by a " ' ", a "," or "whitespace". Note that this lookbehind sits inside the first alternative, so it only guards "[A-Z]+"
- With "[A-Z]+" positioned where it is, I am saying capture any run of one or more capital letters (BUT NOTE: This will also match ABC, SDZ, FFRD, ZXF, etc. as a single run, but will not capture any lowercase letters or other symbols)
- With "|" I'm telling the re engine, "OR" capture this next query
- And with "[\S]{2}\." I'm saying capture if you find any 2 "non-whitespace characters" followed by a literal "." (this is what picks up "st.")
- The final ")" ends the capture group directive
- With the second argument "r' \1'" I'm saying replace each match with a single space followed by the first group captured (in this case I only have 1 capture group anyway)
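The pieces listed above can also be written out with re.VERBOSE, so each one carries its own comment. This is only a sketch of the same first pattern: the name SPLIT_PATTERN and the "parseHTMLDoc" input are mine, added to illustrate the note about runs of capitals.

```python
import re

# The first pattern again, spread over several lines with re.VERBOSE.
# SPLIT_PATTERN is a hypothetical name used only for this illustration.
SPLIT_PATTERN = re.compile(r"""
    (                   # start capture group 1
        (?<![',\s])     # not preceded by a quote, a comma, or whitespace
        [A-Z]+          # one or more capital letters
      |                 # OR
        \S{2}\.         # any two non-whitespace characters, then a literal dot
    )                   # end capture group 1
""", re.VERBOSE)

string = """'Aaa Bbb', 'AaaBbbCcc' ,'OneTwost.Three'"""
result = SPLIT_PATTERN.sub(r" \1", string)
print(result)   # 'Aaa Bbb', 'Aaa Bbb Ccc' ,'One Two st. Three'

# A run of capitals is captured as one group, so the space goes in front
# of the whole run and not between its letters.
acronym = SPLIT_PATTERN.sub(r" \1", "parseHTMLDoc")
print(acronym)  # parse HTMLDoc
```

If you need "HTML" and "Doc" separated as well, this pattern would have to change; as written it treats "HTMLD" as one match.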
Edit: Brief explanation of the 2 patterns from the edit, which can also act on string_1
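I swear, re.sub's behavior with lookarounds is just wonky. Given your comment below: through each of the (?<!YOUR_IGNORED_CHARACTER), I'm telling re.sub to essentially not match if what would be captured is immediately preceded by the designated character. Because these lookbehinds sit in front of the capture group, they apply to both alternatives, not only to the capital letters. (?<!^), however, means do not match if the capture group would start at the beginning of the string (^ only means the beginning of each line when re.MULTILINE is set). The (?:...) wrapper in the second variant only groups the lookbehinds and does not change the result.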
Note also that string_1 in this example is the string you had given with the ' characters removed.
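To see why (?<!^) matters once the quotes are gone, here is a small sketch that runs the first pattern and the edited pattern on string_1. The names first, second, without_anchor and with_anchor are mine.

```python
import re

string_1 = """Aaa Bbb, AaaBbbCcc ,OneTwost.Three"""

# First pattern: nothing stops a capital letter at the very start from matching.
first = r"((?<![\',\s])[A-Z]+|[\S]{2}\.)"
# Edited pattern: (?<!^) rejects a match that would begin at the start of the string.
second = r"(?<!^)(?<!,)(?<!\s)(?<!\')([A-Z]+|[\S]{2}\.)"

without_anchor = re.sub(first, r" \1", string_1)
with_anchor = re.sub(second, r" \1", string_1)

print(repr(without_anchor))  # ' Aaa Bbb, Aaa Bbb Ccc ,One Two st. Three'
print(repr(with_anchor))     # 'Aaa Bbb, Aaa Bbb Ccc ,One Two st. Three'
```

With the original quoted string the leading ' already blocks the first capital, which is presumably why the first pattern did not need (?<!^).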