I had a problem and I found a solution but I feel it's the wrong way to do it. Maybe, there is a more 'canonical' way to do it.
I already had an answer for a really similar problem, but here I have not the same amount of rows in each dataframe. Sorry for the "double-post", but the first one is still valid so I think it's better to make a new one.
Problem
I have two dataframe that I would like to merge without having extra column and without erasing existing infos. Example :
Existing dataframe (df)
A A2 B
0 1 4 0
1 2 5 1
2 2 5 1
Dataframe to merge (df2)
A A2 B
0 1 4 2
1 3 5 2
I would like to update df with df2 if columns 'A' and 'A2' corresponds.
The result would be :
A A2 B
0 1 4 2 <= Update value ONLY
1 2 5 1
2 2 5 1
Here is my solution, but I think it's not a really good one.
import pandas as pd
df = pd.DataFrame([[1,4,0],[2,5,1],[2,5,1]],columns=['A','A2','B'])
df2 = pd.DataFrame([[1,4,2],[3,5,2]],columns=['A','A2','B'])
df = df.merge(df2,on=['A', 'A2'],how='left')
df['B_y'].fillna(0, inplace=True)
df['B'] = df['B_x']+df['B_y']
df = df.drop(['B_x','B_y'], axis=1)
print(df)
I tried this solution :
rows = (df[['A','A2']] == df2[['A','A2']]).all(axis=1)
df.loc[rows,'B'] = df2.loc[rows,'B']
But I have this error because of the wrong number of rows :
ValueError: Can only compare identically-labeled DataFrame objects
Does anyone has a better way to do ? Thanks !
