If the values you want to retrieve are guaranteed to be double-quoted strings, the definition below should work. It allows escaped quotes inside values, returns None instead of raising an exception when the key is not present, and won't give false positives when your key is a suffix of an existing key.
import re

def parser(key, string):
    # re.escape keeps regex metacharacters in the key from being interpreted
    m = re.search(fr'(?<![A-Z]){re.escape(key)}="(.*?)(?<!\\)"', string)
    if m:
        return m.group(1)
    return None
The first part of the regex, (?<![A-Z]), is a negative look-behind: it lets the match proceed only when the character immediately before your key is not in the A-Z range. This ensures that you don't get false positives when you query the string with a key that is a suffix of an existing key (e.g. US, which is a suffix of STATUS). Note that it only guards against uppercase letters, so lowercase keys such as id would need a wider character class.
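To see what the look-behind buys you, here is a small sketch that runs the same lookup with and without it. The helper names find_plain and find_guarded and the shortened sample string are made up for this illustration, and the escaped-quote handling is left out to keep the comparison focused.

```python
import re

sample = '<STATUS="OK" VERSION="B">'

def find_plain(key, string):
    # No look-behind: the key may match in the middle of a longer key.
    m = re.search(fr'{re.escape(key)}="(.*?)"', string)
    return m.group(1) if m else None

def find_guarded(key, string):
    # Negative look-behind: reject a match preceded by an uppercase letter.
    m = re.search(fr'(?<![A-Z]){re.escape(key)}="(.*?)"', string)
    return m.group(1) if m else None

print(find_plain('US', sample))        # OK (false positive, matched inside STATUS)
print(find_guarded('US', sample))      # None
print(find_guarded('STATUS', sample))  # OK
```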
Returning the values without the quotes is simply a matter of including the quotes in the regex but outside the group that you retrieve. That's what happens in the expression "(.*?)(?<!\\)". The group that captures the value you want to retrieve is (.*?). The (?<!\\) expression is a negative look-behind that ensures that the " at the end only matches when it is not preceded by a backslash.
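One case the look-behind does not cover is a value that ends in an escaped backslash, such as "C:\\": the closing quote is preceded by a backslash, so it is skipped and the match runs on to the next unescaped quote. If that can occur in your data, a different technique is to consume escape pairs explicitly. The sketch below is a hypothetical variant (parser_strict is not part of the answer above), assuming a backslash always escapes exactly the next character.

```python
import re

def parser_strict(key, string):
    # The value is a run of either plain characters (anything except a
    # quote or a backslash) or backslash-escaped pairs such as \" and \\.
    m = re.search(fr'(?<![A-Z]){re.escape(key)}="((?:[^"\\]|\\.)*)"', string)
    return m.group(1) if m else None

sample = r'<PATH="C:\\" NAME="User said \"hi!\"">'
print(parser_strict('PATH', sample))
print(parser_strict('NAME', sample))
```

With this pattern the first call captures only the PATH value (C: followed by the two backslash characters) rather than running into the NAME attribute.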
Example:
sample = r'<STATUS="OK" VERSION="B" MESSAGE="User said \"hi!\""><timestamp="1602765370" id="123">'
[parser('STATUS', sample),
parser('US', sample),
parser('MESSAGE', sample)]
Output:
['OK', None, 'User said \\"hi!\\"']
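The returned message still contains the escaping backslashes (the list display shows each one as \\). If you want the unescaped text, one possible post-processing step is sketched below. The unescape helper is a hypothetical addition, and it assumes a backslash only ever escapes the single character that follows it.

```python
import re

def unescape(value):
    # Drop each escaping backslash and keep the character that follows it.
    return re.sub(r'\\(.)', r'\1', value)

raw_value = r'User said \"hi!\"'
print(unescape(raw_value))  # User said "hi!"
```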