I have two pandas dataframes as given below:

df1

Name   City        Postal_Code     State
James  Phoenix        85003         AZ
John   Scottsdale     85259         AZ
Jeff   Phoenix        85003         AZ
Jane   Scottsdale     85259         AZ

df2

Postal_Code       Income       Category
  85003            41038         Two
  85259            104631        Four

I would like to add two columns, Income and Category, to df1 by looking up the values from df2 that correspond to the Postal_Code of each row in df1.

The closest question that I could find on SO was "Fill DataFrame row values based on another dataframe row's values pandas", but the pd.merge solution there does not solve the problem for me. Specifically, I used

pd.merge(df1,df2,on='postal_code',how='outer')

All I got was NaN values in the two new columns. I am not sure whether this is because df1 and df2 have different numbers of rows. Any suggestions to solve this problem?

1 Answer

You just have the wrong how; use 'inner' instead, which keeps only the rows whose keys exist in both dataframes. Also make sure Postal_Code has the same dtype in both dataframes before merging:

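# Cast the key column to the same integer dtype in both dataframes
df1['Postal_Code'] = df1['Postal_Code'].astype(int)
df2['Postal_Code'] = df2['Postal_Code'].astype(int)

# Keep only rows whose Postal_Code appears in both dataframes
df1.merge(df2, on='Postal_Code', how='inner')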


    Name        City  Postal_Code State  Income Category
0  James     Phoenix        85003    AZ   41038      Two
1   Jeff     Phoenix        85003    AZ   41038      Two
2   John  Scottsdale        85259    AZ  104631     Four
3   Jane  Scottsdale        85259    AZ  104631     Four
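
For reference, here is a self-contained sketch of the same idea that can be run as-is. It rebuilds the two dataframes from the question and assumes, as one plausible cause of the missing values, that Postal_Code was loaded as strings in df1 and as integers in df2; that assumption is not confirmed by the question. It also uses how='left' rather than 'inner', which is a swapped-in choice to keep every row of df1 in its original order even if a postal code has no match in df2.

```python
import pandas as pd

# Data copied from the question.
df1 = pd.DataFrame({
    "Name": ["James", "John", "Jeff", "Jane"],
    "City": ["Phoenix", "Scottsdale", "Phoenix", "Scottsdale"],
    # Assumption: the codes were read in as strings (e.g. from a CSV).
    "Postal_Code": ["85003", "85259", "85003", "85259"],
    "State": ["AZ", "AZ", "AZ", "AZ"],
})
df2 = pd.DataFrame({
    "Postal_Code": [85003, 85259],
    "Income": [41038, 104631],
    "Category": ["Two", "Four"],
})

# Give the key column the same integer dtype on both sides.
df1["Postal_Code"] = df1["Postal_Code"].astype(int)
df2["Postal_Code"] = df2["Postal_Code"].astype(int)

# A left merge keeps all rows of df1 and looks up Income and Category in df2.
merged = df1.merge(df2, on="Postal_Code", how="left")
print(merged)
```

With how='left', any postal code missing from df2 would show up as NaN in Income and Category instead of being dropped, which makes unmatched rows easy to spot.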

2 Comments

No luck with this either. The output is an empty dataframe.
Updated. Make sure both Postal_Code columns are the same type.
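
A quick way to check whether the key columns can match at all is to compare their dtypes and see which key values the two dataframes share. The sketch below is only illustrative: shared_keys is a hypothetical helper, not a pandas function, and the string-versus-integer mismatch is an assumed example of what "not the same type" might look like.

```python
import pandas as pd

def shared_keys(left, right, key):
    """Hypothetical helper: key values present in both dataframes."""
    return set(left[key]) & set(right[key])

# Assumed mismatch: strings on one side, integers on the other.
left = pd.DataFrame({"Postal_Code": ["85003", "85259"]})
right = pd.DataFrame({"Postal_Code": [85003, 85259]})

# Different dtypes here are a sign that the keys will not line up.
print(left["Postal_Code"].dtype, right["Postal_Code"].dtype)

# "85003" and 85003 are different values, so nothing is shared yet.
before = shared_keys(left, right, "Postal_Code")

# After casting to a common dtype, the keys overlap.
left["Postal_Code"] = left["Postal_Code"].astype(int)
after = shared_keys(left, right, "Postal_Code")
print(before, after)
```

If the shared set is empty before casting and non-empty afterwards, a dtype mismatch is the likely reason for an empty inner merge or all-NaN columns.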
