Python Pandas: Concatenate and update dataframe values from another dataframe

Question

I have this 2 dataframes.

import pandas as pd

data1 = {
         'Product': ['product1', 'product2', 'product3'],
         'Price': [200, 300, 400],
         'Quantity': [10, 5, 20],
    }
df1 = pd.DataFrame(data1, columns= ['Product','Price','Quantity'])

print(df1)

data2 = {
         'Product': ['product1','product2','product4'],
         'Price': [200, 1000,50],
}

df2 = pd.DataFrame(data2, columns= ['Product','Price'])

df1:

    Product  Price  Quantity
0  product1    200        10
1  product2    300         5
2  product3    400        20

df2:

    Product  Price
0  product1    200
1  product2   1000
2  product4     50

I search for concatening and updating both to obtain this dataframe:

    Product  Price  Quantity
0  product1    200       10
1  product2   1000       5
2  product3     -1       20
2  product4     50       NaN

This means that:

New product in df2 (product4) has to be added with the available information (Price)
Product which is not in df2 should be kept with Price set to -1
Product in df1 and df2 has to only has his price updated (product2)
All other Products are kept the same.

Thank you for your help.

Product which is not in df2 should be kept with Price set to -1 Worth mentioning that this implies that the price in df1 never affects the output dataframe. Either df2 overrides it, or df2 doesn't include it, so it will be set to -1. Is that intended? — Nick ODell
– Nick ODell, Commented Jan 22, 2023 at 17:19

Tom · Accepted Answer · 2023-01-22 17:22:16Z

1

Here's a merge based solution:

Get the Quantity value from df1 by merging the two DataFrames on the Product. The Price from df1 is dropped so that those prices aren't added to the final DataFrame. This is an outer merge to ensure the result has products from both df1 and df2.
The above step almost gets you to the desired result, except for needing to replace missing prices with -1 and sorting based on product.

final = df2.merge(df1.drop('Price', axis=1), on='Product', how='outer')
final['Price'].fillna(-1, inplace=True)
final.sort_values('Product', inplace=True)

Result:

    Product   Price  Quantity
0  product1   200.0      10.0
1  product2  1000.0       5.0
3  product3    -1.0      20.0
2  product4    50.0       NaN

edited Jan 22, 2023 at 17:22

answered Jan 22, 2023 at 17:17

Tom

8,8402 gold badges20 silver badges44 bronze badges

Sign up to request clarification or add additional context in comments.

Collectives™ on Stack Overflow

Python Pandas: Concatenate and update dataframe values from another dataframe

1 Answer 1

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Your Answer

Sign up or log in

Post as a guest

Related