3

I'm trying to create a dataframe (e.g., df3) that overwrites salary information onto people's names. I currently working with df1 with a list of around 1,000 names. Here's an example of what df1 looks like.

    print df1.head() 

               Salary
    Name       
    Joe Smith  8700
    Jane Doe   6300
    Rob Dole   4700
    Sue Pam    2100
    Jack Li    3400

I also have df2, which randomly assigns people from df1 as either Captain and Skipper columns.

    print df2.head()

    Captain    Skipper
    Sue Pam    Joe Smith
    Jane Doe   Sue Pam
    Rob Dole   Joe Smith
    Joe Smith  Sue Pam
    Rob Dole   Jack Li

How can I replace the names in df2 with their corresponding salaries so that I have this exact format below. In excel, I would use a VLOOKUP function, but I'm not sure how to accomplish this using Python.

print df3.head()   

    Captain    Skipper
    2100       8700
    6300       2100
    4700       8700
    8700       2100
    4700       3400

4 Answers 4

1

You could use df2[col].map(df1['Salary']) to map each column of df2 according to df1['Salary']:

import pandas as pd
df1 = pd.DataFrame({'Salary':[8700,6300,4700,2100,3400]}, index=pd.Series(['Joe Smith', 'Jane Doe', 'Rob Dole', 'Sue Pam', 'Jack Li'], name='Name'))
df2 = pd.DataFrame({'Captain':['Sue Pam', 'Jane Doe',  'Rob Dole', 'Joe Smith', 'Rob Dole'], 'Skipper': ['Joe Smith',  'Sue Pam', 'Joe Smith', 'Sue Pam', 'Jack Li']})

df3 = pd.DataFrame({col: df2[col].map(df1['Salary']) for col in df2})
print(df3)

yields

   Captain  Skipper
0     2100     8700
1     6300     2100
2     4700     8700
3     8700     2100
4     4700     3400
Sign up to request clarification or add additional context in comments.

Comments

1

You can lookup the salary for each name in df1 with df1.loc[name, 'Salary']. Using .applymap(), you can do this for all entries in all columns of df2:

df3 = df2.applymap(lambda x: df1.loc[x, 'Salary'])
print(df3)

Result:

    Captain  Skipper
0     2100     8700
1     6300     2100
2     4700     8700
3     8700     2100
4     4700     3400

1 Comment

Worked, as well. Thank you!
1

I'd leverage the pd.Series.map feature combined with stack to make df2 a series. Then unstack back.

  • stack df2 to make it a series
  • use map with df1.Salary
  • unstack to get back your dataframe

df2.stack().map(df1.Salary).unstack()

   Captain  Skipper
0     2100     8700
1     6300     2100
2     4700     8700
3     8700     2100
4     4700     3400

1 Comment

Worked great. Thank you!
0

dataframe.replace method is more suitable than doing lookup.

>>> df1 = pd.DataFrame({'Salary':[8700,6300,4700,2100,3400]}, index=pd.Series(['Joe Smith', 'Jane Doe', 'Rob Dole', 'Sue Pam', 'Jack Li'], name='Name'))
>>> df2 = pd.DataFrame({'Captain':['Sue Pam', 'Jane Doe',  'Rob Dole', 'Joe Smith', 'Rob Dole'], 'Skipper': ['Joe Smith',  'Sue Pam', 'Joe Smith', 'Sue Pam', 'Jack Li']})
>>> df2.replace(df1.Salary)
   Captain  Skipper
0     2100     8700
1     6300     2100
2     4700     8700
3     8700     2100
4     4700     3400

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.