
I can't find an elegant way to merge these two datasets.

Let's say I have a first dataset with the temperatures of cities per year:

       2016 2017
cityA   23  27
cityB   24  28

And another one with a lot of other information, which looks like this:

    city    year    other
0   cityA   2016    aa
1   cityB   2017    bb
2   cityA   2016    cc
3   cityB   2017    dd

And I would like the following result:

    city    year    other   temperatures
0   cityA   2016    aa      23
1   cityB   2017    bb      28
2   cityA   2016    cc      23
3   cityB   2017    dd      28

Thanks for your help!

EDIT: the real, more complex dataframes:

Dataframe 1, with temperatures: [screenshot]

Dataframe 2, with other data: [screenshot]

Result of implementing the answer: [screenshot]

2 Answers


Use stack with reset_index to reshape the first dataframe into long format, then merge it into the second one, I think with a left join:

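# reshape wide (one column per year) to long (one row per city/year pair)
df11 = df1.stack().reset_index()
df11.columns = ['city','year','temperatures']
# if the years are strings, convert them to integers so they match df2['year']
df11['year'] = df11['year'].astype(int)

# left join keeps every row of df2, even when no temperature is found
df = df2.merge(df11, on=['city','year'], how='left')
print(df)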
    city  year other  temperatures
0  cityA  2016    aa            23
1  cityB  2017    bb            28
2  cityA  2016    cc            23
3  cityB  2017    dd            28
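
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the two sample frames from the question and applies the same stack + merge steps. The string year labels in df1 are my assumption (column headers read from a file are often strings); the rest follows the question's data.

```python
import pandas as pd

# Sample data from the question. The year labels are strings here,
# which is an assumption made to show why the cast below is needed.
df1 = pd.DataFrame({"2016": [23, 24], "2017": [27, 28]},
                   index=["cityA", "cityB"])
df2 = pd.DataFrame({
    "city": ["cityA", "cityB", "cityA", "cityB"],
    "year": [2016, 2017, 2016, 2017],
    "other": ["aa", "bb", "cc", "dd"],
})

# Wide to long: one row per (city, year) pair
df11 = df1.stack().reset_index()
df11.columns = ["city", "year", "temperatures"]

# Make the key dtype match df2["year"] (integers)
df11["year"] = df11["year"].astype(int)

# Left join keeps every row of df2
df = df2.merge(df11, on=["city", "year"], how="left")
print(df)
```

If the year labels in your df1 are already integers, the astype(int) line is harmless.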

4 Comments

Thanks for your help. Unfortunately, it works well on this example but not on my real problem. I will try to add more explanation.
I edited my original post with a screenshot of the implementation of your solution. As you can see, NaN values appear after the merge.
@Roger - You forgot to cast the year column to integers.
Yes, you're right! It works well with the extra line to cast to integers! Thanks a lot for your precious help Jezrael, I wish you a very nice day!
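
The NaN issue discussed in these comments comes from the merge keys having different dtypes. As a rough sketch of how one might check for that, the snippet below compares the dtypes of the year columns, casts one to match the other, and uses merge's indicator option to count unmatched rows. The small frames are hypothetical stand-ins for the real data; depending on the pandas version, merging without the cast may give NaN or may be refused with an error.

```python
import pandas as pd

# Hypothetical stand-ins for the real frames
df2 = pd.DataFrame({"city": ["cityA", "cityB"],
                    "year": [2016, 2017],
                    "other": ["aa", "bb"]})
df11 = pd.DataFrame({"city": ["cityA", "cityB"],
                     "year": ["2016", "2017"],   # strings, not integers
                     "temperatures": [23, 28]})

# Inspect the key dtypes before merging; they should be the same
print(df2["year"].dtype, df11["year"].dtype)

# Cast the long-format key to the dtype used by df2
df11["year"] = df11["year"].astype(df2["year"].dtype)

# indicator=True adds a "_merge" column saying where each row came from
checked = df2.merge(df11, on=["city", "year"], how="left", indicator=True)

# Rows marked "left_only" found no temperature
unmatched = checked[checked["_merge"] == "left_only"]
print(len(unmatched))
```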

melt + merge

You can melt your "pivoted" dataframe, then left merge it with your master dataframe. This assumes the year columns in your first dataframe are integers.

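import pandas as pd

# reset_index turns the city labels into a column named 'index';
# melt then yields the columns 'index', 'variable' (year) and 'value'
melted = pd.melt(df1.reset_index(), id_vars='index')

res = df2.merge(melted, left_on=['city', 'year'],
                right_on=['index', 'variable'], how='left')

print(res[['city', 'year', 'other', 'value']])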

    city  year other  value
0  cityA  2016    aa     23
1  cityB  2017    bb     28
2  cityA  2016    cc     23
3  cityB  2017    dd     28
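
A possible variant of the same idea, as a sketch: naming the columns during the melt (via rename_axis, var_name and value_name) lets you merge with on= and skip the column selection afterwards. The sample frames are rebuilt from the question, and the astype(int) cast is a precaution I added in case the year labels are not integers.

```python
import pandas as pd

# Sample data from the question, with integer year labels
df1 = pd.DataFrame({2016: [23, 24], 2017: [27, 28]},
                   index=["cityA", "cityB"])
df2 = pd.DataFrame({
    "city": ["cityA", "cityB", "cityA", "cityB"],
    "year": [2016, 2017, 2016, 2017],
    "other": ["aa", "bb", "cc", "dd"],
})

# Name the index "city" so reset_index produces a "city" column,
# then melt straight into the final column names
melted = (
    df1.rename_axis("city")
       .reset_index()
       .melt(id_vars="city", var_name="year", value_name="temperatures")
)

# Precaution: make sure the year key is an integer column like df2["year"]
melted["year"] = melted["year"].astype(int)

res = df2.merge(melted, on=["city", "year"], how="left")
print(res)
```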
