I'm attempting to do this:
import re
p = re.compile(ur'([A-Z]\w+\s+[A-Z]\w+)|([A-Z]\w+)(?=\s+and\s+[A-Z]\w+\s+([A-Z]\w+))', re.MULTILINE)
test_str = u"Russ Middleton and Lisa Murro\nRon Iervolino, Trish and Russ Middleton, and Lisa Middleton \nRon Iervolino, Kelly and Tom Murro\nRon Iervolino, Trish and Russ Middleton and Lisa Middleton "
subst = u"$1$2 $3"
result = re.sub(p, subst, test_str)
The goal is to get something that both matches all the names and fills in last names when necessary (e.g., Trish and Russ Middleton becomes Trish Middleton and Russ Middleton). In the end, I'm looking for the names that appear together in a single line.
Someone else was kind enough to help me with the regex, and I thought I knew how to apply it programmatically in Python (although I'm new to Python). When I couldn't get it working, I fell back on the code generated by Regex101 (the code shown above). However, all I get in result is:
u'$1$2 $3 and $1$2 $3\n$1$2 $3, $1$2 $3 and $1$2 $3, and $1$2 $3 \n$1$2 $3, $1$2 $3 and $1$2 $3\n$1$2 $3, $1$2 $3 and $1$2 $3 and $1$2 $3 '
What am I missing with Python and regular expressions?
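For reference, Python's re.sub does not treat $1 as a group reference. The replacement string uses \1 or \g<1> instead, which is why the literal text $1$2 $3 shows up in the output. Below is a minimal sketch of one possible fix, assuming Python 3 (so the ur'' prefix becomes a plain raw string). The helper name fill_last_name is made up for illustration. It uses a replacement function rather than a template such as r'\1\2 \3', because that template would append a stray space after every full name and, as far as I know, raises an error on unmatched groups in Python versions before 3.5.

```python
import re

# Same pattern as in the question, written as a Python 3 raw string.
pattern = re.compile(
    r'([A-Z]\w+\s+[A-Z]\w+)|([A-Z]\w+)(?=\s+and\s+[A-Z]\w+\s+([A-Z]\w+))'
)

def fill_last_name(match):
    # Hypothetical helper: group 1 is a full "First Last" name, keep it as is.
    if match.group(1):
        return match.group(1)
    # Otherwise group 2 is a lone first name and group 3 is the last name
    # captured by the lookahead.
    return match.group(2) + " " + match.group(3)

line = "Ron Iervolino, Trish and Russ Middleton, and Lisa Middleton"
result = pattern.sub(fill_last_name, line)
print(result)
# Ron Iervolino, Trish Middleton and Russ Middleton, and Lisa Middleton
```

To get the names that appear together on each line, the same function could be applied line by line (for example over test_str.splitlines()) before collecting the matches.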