A regex solution:
r'^([^[]+)(?:\[([^\]]+)])?$'
^ Matches start of string.
([^[]+) Capture group 1: matches 1 or more characters that are not '['.
(?: Start of non-capturing group.
\[ Matches '['.
([^\]]+) Capture group 2: matches 1 or more characters that are not ']'.
] Matches ']'.
) End of non-capturing group.
? Makes the non-capturing group optional.
$ Matches end of string.
import re
tests = ['john', 'john[doe]']
for test in tests:
    m = re.match(r'^([^[]+)(?:\[([^\]]+)])?$', test)
    if m:
        print(test, '->', m[1], m[2])
Prints:
john -> john None
john[doe] -> john doe
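If it helps, the same pattern can be written with re.VERBOSE so that the breakdown above lives next to the regex itself. This is only a sketch: the names PATTERN and split_name are my own, not something from the question.

```python
import re

# Same regex as above, spread out with re.VERBOSE so each piece can carry a comment.
# PATTERN and split_name are illustrative names, not part of the original question.
PATTERN = re.compile(r'''
    ^               # start of string
    ([^[]+)         # group 1: one or more characters that are not '['
    (?:             # start of non-capturing group
        \[          #   a literal '['
        ([^\]]+)    #   group 2: one or more characters that are not ']'
        ]           #   a literal ']'
    )?              # end of non-capturing group; '?' makes it optional
    $               # end of string
''', re.VERBOSE)


def split_name(text):
    """Return (group 1, group 2) if text matches, otherwise None."""
    m = PATTERN.match(text)
    if m is None:
        return None
    return m.group(1), m.group(2)


for test in ['john', 'john[doe]', 'john[']:
    print(test, '->', split_name(test))
```

An input such as 'john[' has an opening bracket with nothing after it, so the pattern does not match and split_name returns None instead of a tuple.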
Explanations
First, anything between parentheses ( ) is a capturing group, and anything between (?: ) is a non-capturing group. Either type of group can contain capturing and non-capturing groups within it.
[] is used to define a set of characters. For example, [aqw] matches 'a', 'q' or 'w', and [a-e] matches 'a', 'b', 'c', 'd' or 'e'. A leading ^ negates the set, so [^aqw] matches any character other than 'a', 'q' or 'w'.
So [^\]] matches any character other than ']'. The \ in front of the ] "escapes" it, because in that context ] has special meaning: it would otherwise close the [] construct. The following + sign denotes "one or more of what preceded this", so ([^[]+) matches one or more of any character that is not a [.
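To make the pieces above concrete, here is a small sketch you can run. The sample strings and variable names are made up purely for illustration.

```python
import re

# A set matches exactly one character out of those listed.
set_hits = re.findall(r'[aqw]', 'swan')

# A range inside a set: any one character from 'a' through 'e'.
range_hits = re.findall(r'[a-e]', 'badge')

# A leading ^ negates the set: any one character that is NOT 'a', 'q' or 'w'.
negated_hits = re.findall(r'[^aqw]', 'swan')

# Escaped ']' inside a negated set, followed by '+': runs of non-']' characters.
not_bracket = re.findall(r'[^\]]+', 'ab]cd')

# Only capturing groups show up in groups(); the (?: ) group does not.
groups = re.match(r'(?:ab)+(c)', 'ababc').groups()

print(set_hits, range_hits, negated_hits, not_bracket, groups)
```

Note that the non-capturing group still takes part in the match (it consumes 'abab'); it is just not numbered and is not returned by groups().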
I hope the preceding explanations help.