0

I'm pretty sure I'm asking the wrong question here, so here goes. I have two dataframes; let's call them df1 and df2.

df1 looks like this:

import pandas as pd

data = {'Employee ID' : [12345, 23456, 34567],
        'Values' : [123168546543154, 13513545435145434, 556423145613],
        'Employee Name' : ['Jones, John', 'Potter, Harry', 'Watts, Wade'],
        'Department Supervisor' : ['Wendy Davis', 'Albus Dumbledore', 'James Halliday']}
df1 = pd.DataFrame(data, columns=['Employee ID','Values','Employee Name','Department Supervisor'])

df2 looks similar:

data = {'Employee ID' : [12345, 23456, 34567],
        'Employee Name' : ['Jones, John', 'Potter, Harry', 'Watts, Wade'],
        'Department Supervisor' : ['Davis, Wendy', 'Dumbledore, Albus', 'Halliday, James']}
df2 = pd.DataFrame(data, columns=['Employee ID','Employee Name','Department Supervisor'])

My issue is that df1 comes from an Excel file that sometimes has an Employee ID entered and sometimes doesn't. This is where df2 comes in: it is a SQL pull from the employee database that I'm using to validate the employee names and supervisor names, to ensure the correct Employee ID is used.

Normally I'd be happy to merge the dataframes to get my desired result, but because the supervisor names are in different formats, I'd like to use regex on df1 to turn 'Wendy Davis' into 'Davis, Wendy' (and likewise for the other supervisor names) to match what df2 has. So far I'm coming up empty on how to search for an answer. Any suggestions?

3 Answers

1

If I understand correctly, is this what you need?

df1['DS Corrected'] = df1['Department Supervisor'].str.replace(r'(\w+) (\w+)', r'\2, \1', regex=True)

Output:

   Employee ID             Values  Employee Name Department Supervisor       DS Corrected
0        12345    123168546543154    Jones, John           Wendy Davis       Davis, Wendy
1        23456  13513545435145434  Potter, Harry      Albus Dumbledore  Dumbledore, Albus
2        34567       556423145613    Watts, Wade        James Halliday    Halliday, James
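
For the validation step the question describes, a minimal sketch of how the corrected column could feed a merge against df2. The choice of merge keys (employee name plus supervisor name) and the intermediate name `left` are assumptions for illustration, not something the question specifies; `indicator=True` adds pandas' standard `_merge` column so unmatched rows are easy to spot.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Employee ID': [12345, 23456, 34567],
    'Values': [123168546543154, 13513545435145434, 556423145613],
    'Employee Name': ['Jones, John', 'Potter, Harry', 'Watts, Wade'],
    'Department Supervisor': ['Wendy Davis', 'Albus Dumbledore', 'James Halliday'],
})
df2 = pd.DataFrame({
    'Employee ID': [12345, 23456, 34567],
    'Employee Name': ['Jones, John', 'Potter, Harry', 'Watts, Wade'],
    'Department Supervisor': ['Davis, Wendy', 'Dumbledore, Albus', 'Halliday, James'],
})

# Raw strings pass the backslashes through to the regex engine unchanged.
df1['DS Corrected'] = df1['Department Supervisor'].str.replace(
    r'(\w+) (\w+)', r'\2, \1', regex=True)

# Assumption: the Excel-side Employee ID is unreliable, so drop it and take the ID from df2.
left = (df1.drop(columns=['Employee ID', 'Department Supervisor'])
           .rename(columns={'DS Corrected': 'Department Supervisor'}))

# Left merge keeps every Excel row; '_merge' is 'both' where df2 confirmed the pair.
merged = left.merge(df2, on=['Employee Name', 'Department Supervisor'],
                    how='left', indicator=True)
print(merged)
```

Rows where `_merge` is `left_only` would be the ones whose name and supervisor pair was not found in the database pull.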

1 Comment

@Cwnosky Happy coding. Be safe and stay healthy.
1

Albus' full name is Albus Percival Wulfric Brian Dumbledore and James' is James Donovan Halliday (if we're talking about Ready Player One), so consider a dataframe like this:

    Employee ID     Values              Employee Name       Department Supervisor
0   12345           123168546543154     Jones, John         Wendy Davis
1   23456           13513545435145434   Potter, Harry       Albus Percival Wulfric Brian Dumbledore
2   34567           556423145613        Watts, Wade         James Donovan Halliday

So we need to swap the last name to the front with...

import pandas as pd

data = {'Employee ID' : [12345, 23456, 34567],
        'Values' : [123168546543154, 13513545435145434, 556423145613],
        'Employee Name' : ['Jones, John', 'Potter, Harry', 'Watts, Wade'],
        'Department Supervisor' : ['Wendy Davis', 'Albus Percival Wulfric Brian Dumbledore', 'James Donovan Halliday']}
df1 = pd.DataFrame(data, columns=['Employee ID','Values','Employee Name','Department Supervisor'])

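def swap_names(text):
    # Split on whitespace: first token, any middle names, and the final token as the surname.
    first, *middle, last = text.split()
    # Join the given names back together; works with or without middle names.
    return last + ', ' + ' '.join([first, *middle])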

df1['Department Supervisor'] = [swap_names(row) for row in df1['Department Supervisor']]

print(df1)

Outputs:

    Employee ID     Values              Employee Name   Department Supervisor
0   12345           123168546543154     Jones, John     Davis, Wendy
1   23456           13513545435145434   Potter, Harry   Dumbledore, Albus Percival Wulfric Brian
2   34567           556423145613        Watts, Wade     Halliday, James Donovan
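
One caveat worth sketching: `swap_names` unpacks at least two tokens, so a single-word name raises a `ValueError`, and a blank cell (which an Excel import may well contain) is not a string at all. The variant below, with the hypothetical name `swap_names_safe`, passes such values through unchanged. Whether passing them through is the right policy is an assumption; raising or logging might suit the validation better. The sample values `'Anorak'` and `None` are made-up test inputs.

```python
import pandas as pd

def swap_names_safe(text):
    """Return 'Last, First Middle...', or the input unchanged if it cannot be swapped."""
    # Missing cells (None/NaN) and other non-strings pass through untouched.
    if not isinstance(text, str):
        return text
    parts = text.split()
    # Fewer than two tokens: there is nothing to swap.
    if len(parts) < 2:
        return text
    *given, last = parts
    return f"{last}, {' '.join(given)}"

supervisors = pd.Series(
    ['Wendy Davis', 'Albus Percival Wulfric Brian Dumbledore', 'Anorak', None])
result = supervisors.map(swap_names_safe)
print(result.tolist())
```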

1 Comment

Always enjoyable when someone gets my references!
0

Maybe this, if every supervisor name has exactly two words...

df1['Department Supervisor'] = [', '.join(x.split()[::-1]) for x in df1['Department Supervisor']]

Outputs:

    Employee ID     Values              Employee Name       Department Supervisor
0   12345           123168546543154     Jones, John         Davis, Wendy
1   23456           13513545435145434   Potter, Harry       Dumbledore, Albus
2   34567           556423145613        Watts, Wade         Halliday, James
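
Note that reversing every token only gives the intended result for two-word names. A small sketch of what happens with a middle name, plus one possible alternative that moves only the last token to the front. The regex below is my own suggestion, not part of the answer above, and it leaves single-word names unchanged because the pattern does not match them.

```python
import pandas as pd

names = pd.Series(['Wendy Davis', 'James Donovan Halliday'])

# Reversing all tokens also reverses the given names when there are more than two words.
reversed_all = [', '.join(x.split()[::-1]) for x in names]
print(reversed_all)  # ['Davis, Wendy', 'Halliday, Donovan, James']

# Alternative: capture everything before the last token, then the last token itself.
last_first = names.str.replace(r'^(.*)\s+(\S+)$', r'\2, \1', regex=True)
print(last_first.tolist())  # ['Davis, Wendy', 'Halliday, James Donovan']
```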
