I have these two DataFrames:
import pandas as pd

data1 = {
    'Product': ['product1', 'product2', 'product3'],
    'Price': [200, 300, 400],
    'Quantity': [10, 5, 20],
}
df1 = pd.DataFrame(data1, columns=['Product', 'Price', 'Quantity'])
print(df1)

data2 = {
    'Product': ['product1', 'product2', 'product4'],
    'Price': [200, 1000, 50],
}
df2 = pd.DataFrame(data2, columns=['Product', 'Price'])
df1:
    Product  Price  Quantity
0  product1    200        10
1  product2    300         5
2  product3    400        20

df2:
    Product  Price
0  product1    200
1  product2   1000
2  product4     50
I want to concatenate and update the two so that I obtain this DataFrame:
    Product  Price  Quantity
0  product1    200        10
1  product2   1000         5
2  product3     -1        20
2  product4     50       NaN
This means that:
- A new product in df2 (product4) has to be added with the information available for it (Price).
- A product that is not in df2 should be kept, with its Price set to -1.
- A product that is in both df1 and df2 should only have its price updated (product2).
- All other products are kept the same.
Thank you for your help.
Regarding "Product which is not in df2 should be kept with Price set to -1": it is worth mentioning that this implies the price in df1 never affects the output DataFrame. Either df2 overrides it, or df2 doesn't include the product, in which case the price is set to -1. Is that intended?
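If that reading is the intended one, the four rules reduce to an outer merge on Product followed by filling the missing prices. The following is only a minimal sketch under that assumption, not necessarily the best approach: it drops df1's Price entirely (since it never reaches the output), and the name `result` is introduced here for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Product': ['product1', 'product2', 'product3'],
    'Price': [200, 300, 400],
    'Quantity': [10, 5, 20],
})
df2 = pd.DataFrame({
    'Product': ['product1', 'product2', 'product4'],
    'Price': [200, 1000, 50],
})

# Assumption: df1's Price never affects the output, so drop it before merging.
# The outer merge keeps products that appear in either frame.
result = df1.drop(columns='Price').merge(df2, on='Product', how='outer')

# Products absent from df2 have no price after the merge; mark them with -1.
result['Price'] = result['Price'].fillna(-1).astype(int)

# Sort by Product for a stable row order, then restore the column order
# used in the question.
result = result.sort_values('Product').reset_index(drop=True)
result = result[['Product', 'Price', 'Quantity']]
print(result)
```

Note that Quantity is converted to float in the result, because the NaN for product4 cannot be stored in a plain integer column. If "all other products are kept the same" is meant to preserve df1's price in some case, the `drop` step would need to be replaced with different logic.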