6

I'm new to pandas and I'm trying to figure this scenario out: I have a sample DataFrame with two products. df =

  Product_Num     Date   Description  Price 
          10    1-1-18   Fruit Snacks  2.99
          10    1-2-18   Fruit Snacks  2.99
          10    1-5-18   Fruit Snacks  1.99
          10    1-8-18   Fruit Snacks  1.99
          10    1-10-18  Fruit Snacks  2.99
          45    1-1-18         Apples  2.99 
          45    1-3-18         Apples  2.99
          45    1-5-18         Apples  2.99
          45    1-9-18         Apples  1.49
          45    1-10-18        Apples  1.49
          45    1-13-18        Apples  1.49
          45    1-15-18        Apples  2.99 

I also have another small DataFrame that looks like this (which shows promotional prices of the same products): df2=

  Product_Num   Price 
          10    1.99
          45    1.49 

Notice that df2 does not contain the 'Date' or 'Description' columns. What I want to do is remove all promo-priced rows from df (for all dates that are on promo), using the data from df2. What is the best way to do this?

So, I want to see this:

  Product_Num     Date   Description  Price 
          10    1-1-18   Fruit Snacks  2.99
          10    1-2-18   Fruit Snacks  2.99
          10    1-10-18  Fruit Snacks  2.99
          45    1-1-18         Apples  2.99 
          45    1-3-18         Apples  2.99
          45    1-5-18         Apples  2.99
          45    1-15-18        Apples  2.99 

I was thinking of doing a merge on columns Price and Product_Num, then seeing what I can do from there. But I was getting confused because of the multiple dates.
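For reference, this is roughly the merge I was attempting (I'm not sure it's the right direction, and promo_rows is just a name I made up); it seems to pick out the promo rows, but I don't know how to go from there to dropping them from df:

# inner merge on the two shared columns keeps only the rows of df
# whose (Product_Num, Price) pair also appears in df2, i.e. the promo rows
promo_rows = df.merge(df2, on=['Product_Num', 'Price'], how='inner')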

2 Comments
  • df[df.Price == 2.99] Commented Jan 30, 2018 at 23:07
  • In my large DataFrame, the prices won't all be 2.99 @thomas.mac Commented Jan 30, 2018 at 23:30

4 Answers

9

Use isin with &:

df.loc[~((df.Product_Num.isin(df2['Product_Num']))&(df.Price.isin(df2['Price']))),:]
Out[246]: 
    Product_Num     Date  Description  Price
0            10   1-1-18  FruitSnacks   2.99
1            10   1-2-18  FruitSnacks   2.99
4            10  1-10-18  FruitSnacks   2.99
5            45   1-1-18       Apples   2.99
6            45   1-3-18       Apples   2.99
7            45   1-5-18       Apples   2.99
11           45  1-15-18       Apples   2.99

Update

df.loc[~df.index.isin(df.merge(df2.assign(a='key'),how='left').dropna().index)]
Out[260]: 
    Product_Num     Date  Description  Price
0            10   1-1-18  FruitSnacks   2.99
1            10   1-2-18  FruitSnacks   2.99
4            10  1-10-18  FruitSnacks   2.99
5            45   1-1-18       Apples   2.99
6            45   1-3-18       Apples   2.99
7            45   1-5-18       Apples   2.99
11           45  1-15-18       Apples   2.99

5 Comments

Won't this also catch (product=10 and price=1.49)?
I like this solution. Can you please explain what df2.assign(a='key') does?
Yes, I have the same question as @jp_data_analysis :)
@jp_data_analysis It adds a new key column. Since df2's columns are a subset of df's, a plain left merge would not change anything :-) By building a new column on df2 and then doing the left merge, we can filter out the unmatched rows by NaN.
@Hana When df2's columns equal (are a subset of) df's, df.merge(df2, how='left') just returns df. Only when df and df2 differ in columns can we tell which rows of df are unmatched with df2 and filter them out.
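For what it's worth, here is a minimal sketch of the same anti-join idea using merge's built-in indicator=True instead of the hand-rolled dummy column (column names are taken from the question's frames; merged and result are just illustrative names):

# left-merge on both shared columns and let pandas flag where each row came from
merged = df.merge(df2, on=['Product_Num', 'Price'], how='left', indicator=True)
# '_merge' is 'both' for promo rows and 'left_only' for everything else
result = merged[merged['_merge'] == 'left_only'].drop(columns='_merge')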
2

With Product_Num as the index of both DataFrames, you can drop df2's index values from df1 and then concatenate the two DataFrames:

import pandas as pd

df1 = pd.DataFrame({'Product_Num':[1,2,3,4], 'Date': ['01/01/2012','01/02/2013','02/03/2013','04/02/2013'], 'Price': [10,10,10,10]})
df1 = df1.set_index('Product_Num')
df2 = pd.DataFrame({'Product_Num':[2], 'Date':['03/3/2012'], 'Price': [5]})
df2 = df2.set_index('Product_Num')

Drop and concatenate:

df_new = df1.drop(df2.index)
df_new = pd.concat([df_new, df2])

Result:

               Date  Price
Product_Num                   
1            01/01/2012     10
3            02/03/2013     10
4            04/02/2013     10
2             03/3/2012      5


1

You could turn df2 into a dictionary and then filter out the matching values in df:

df[df[df2.columns].isin(df2.to_dict('list')).sum(1) <= 1]

Yields

      Date   Description  Price  Product_Num
0    1-1-18  Fruit Snacks   2.99           10
1    1-2-18  Fruit Snacks   2.99           10
4   1-10-18  Fruit Snacks   2.99           10
5    1-1-18        Apples   2.99           45
6    1-3-18        Apples   2.99           45
7    1-5-18        Apples   2.99           45
11  1-15-18        Apples   2.99           45
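If it helps, here is the one-liner unpacked step by step (hits and result are just illustrative names for the intermediate pieces):

# boolean frame: True wherever a cell's value appears in the matching df2 column
hits = df[df2.columns].isin(df2.to_dict('list'))
# a row is removed only when both Product_Num and Price match some promo value,
# i.e. it is True in 2 columns; rows matching in at most 1 column are kept
result = df[hits.sum(axis=1) <= 1]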


0

Cute and readable:

promo_prices = df2['Price']
promo_prods = df2['Product_Num']

no_pro = df

for price, prod in zip(promo_prices, promo_prods):
    # drop the rows where both the product and its promo price match
    no_pro = no_pro[~((no_pro['Product_Num'] == prod) & (no_pro['Price'] == price))]

1 Comment

Except that it's not considered good practice to use loops with pandas when there are plenty of other solutions, because it is slow and memory-hungry.
