I'm pretty sure I'm asking the wrong question here so here goes. I have a 2 dataframes, lets call them df1 and df2.
df1 looks like this:
data = {'Employee ID' : [12345, 23456, 34567],
'Values' : [123168546543154, 13513545435145434, 556423145613],
'Employee Name' : ['Jones, John', 'Potter, Harry', 'Watts, Wade'],
'Department Supervisor' : ['Wendy Davis', 'Albus Dumbledore', 'James Halliday']}
df1 = pd.DataFrame(data, columns=['Employee ID','Values','Employee Name','Department Supervisor'])
df2 looks similar:
data = {'Employee ID' : [12345, 23456, 34567],
'Employee Name' : ['Jones, John', 'Potter, Harry', 'Watts, Wade'],
'Department Supervisor' : ['Davis, Wendy', 'Dumbledore, Albus', 'Halliday, James']}
df2 = pd.DataFrame(data, columns=['Employee ID','Employee Name','Department Supervisor'])
My issue is that df1 is from an excel file and that sometimes has an Employee ID entered and sometimes doesn't. This is where df2 comes in, df2 is a sql pull from the employee database that I'm using to validate the employee names and supervisor names to ensure the correct employee id is used.
Normally I'd be happy to merge the dataframes to get my desired result but with the supervisor names being in different formats I'd like to use regex on df1 to turn 'Wendy Davis" into 'Davis, Wendy' along with the other supervisor names to match what df2 has. So far I'm coming up empty on how I want to search this for an answer, suggestions?