Remove rows in Dataframe if they match second dataframe based on index

Question

I have a dataframe which includes some "invalid" rows, which I would like to remove. I have a second dataframe which contains these invalid rows.

The invalid rows are:

DatetimeIndex(['2019-11-11', '2019-12-06', '2019-12-13', '2019-12-15',
           '2019-12-17', '2019-12-18', '2019-12-19', '2019-12-31',
           '2020-01-01', '2020-01-02', '2020-01-03', '2020-01-10',
           '2020-01-15', '2020-01-17', '2020-01-22', '2020-02-05',
           '2020-02-07', '2020-02-09', '2020-02-10', '2020-02-12',
           '2020-02-14', '2020-02-19', '2020-02-20', '2020-02-21',
           '2020-02-25', '2020-02-26', '2020-02-28', '2020-03-02',
           '2020-03-04', '2020-03-06', '2020-03-11', '2020-03-12',
           '2020-03-15', '2020-03-22', '2020-03-29', '2020-04-04',
           '2020-04-11', '2020-04-13', '2020-05-13', '2020-05-23',
           '2020-05-29', '2020-05-30', '2020-06-12', '2020-06-15',
           '2020-06-19', '2020-06-24', '2020-06-26', '2020-07-09',
           '2020-07-10', '2020-07-11', '2020-07-12', '2020-07-16',
           '2020-07-17', '2020-07-18', '2020-07-20', '2020-07-23',
           '2020-07-24', '2020-07-26'],
          dtype='datetime64[ns]', name='dateTime', freq=None)

I want to remove these rows (dates) from:

DatetimeIndex(['2019-11-11 11:00:00', '2019-11-11 12:00:00',
           '2019-11-11 13:00:00', '2019-11-11 14:00:00',
           '2019-11-11 15:00:00', '2019-11-11 16:00:00',
           '2019-11-11 17:00:00', '2019-11-11 18:00:00',
           '2019-11-11 19:00:00', '2019-11-11 20:00:00',
           ...
           '2020-07-26 05:00:00', '2020-07-26 06:00:00',
           '2020-07-26 07:00:00', '2020-07-26 08:00:00',
           '2020-07-26 09:00:00', '2020-07-26 10:00:00',
           '2020-07-26 11:00:00', '2020-07-26 12:00:00',
           '2020-07-26 13:00:00', '2020-07-26 14:00:00'],
          dtype='datetime64[ns]', name='dateTime', length=6196, freq='H')

I tried:

df_steps1h.loc[df_steps1h.index.difference(df_valid.index), ]

and

df_steps1h[~df_steps1h.index.isin(df_valid.index)].dropna()

The DataFrames are different, so I don't want to use concat or merge. But neither attempt removes anything. Any ideas as to why? Thanks!

Anant Kumar · Accepted Answer · 2020-08-01 17:56:30Z

1

Consider df to be the DataFrame of invalid rows and df_valid the original DataFrame from which you want to remove them.

import pandas as pd

# Helper column: each hourly timestamp truncated to its calendar date
df_valid["actual_index"] = pd.to_datetime(df_valid.index.strftime("%Y-%m-%d"))
# Keep only the rows whose date is not in the index of the invalid DataFrame
df_valid = df_valid[~df_valid.actual_index.isin(df.index)]
# Reassign instead of dropping in place, which avoids SettingWithCopyWarning on the filtered result
df_valid = df_valid.drop(columns="actual_index")

In the question, both DataFrames have an index of type DatetimeIndex, but the values differ because of their frequency: one index is hourly, while the other holds one entry per day.
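The mismatch can be seen in a small sketch. The sample data and the names `hourly` and `invalid_days` are made up for illustration and are not taken from the question. A daily value such as 2020-01-01 is stored as midnight, so an exact comparison against hourly timestamps can match at most one row per day.

```python
import pandas as pd

# Hypothetical sample data: two days of hourly timestamps and one invalid date
hourly = pd.date_range("2020-01-01 00:00", "2020-01-02 23:00", freq="h")
invalid_days = pd.DatetimeIndex(["2020-01-01"])

# Exact comparison: only the midnight timestamp equals the daily value
exact_matches = hourly.isin(invalid_days)

# Comparison on the calendar date: every hour of the invalid day matches
date_matches = hourly.normalize().isin(invalid_days)

print(exact_matches.sum(), date_matches.sum())
```

Here `normalize()` resets the time of day to midnight, which is what makes the two indexes comparable.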

The solution converts the hourly index to the same daily resolution and only then performs the comparison.
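As an alternative sketch of the same idea, the comparison can be done on the index directly, without a helper column. This is not the code from the answer: it assumes pandas' `DatetimeIndex.normalize()` is acceptable here, and the sample frames below (`df_steps1h`, `df_invalid`, and the `steps` column) are hypothetical stand-ins for the question's data.

```python
import pandas as pd

# Hypothetical stand-ins for the question's data: three days of hourly rows
idx = pd.date_range("2020-01-01 00:00", "2020-01-03 23:00", freq="h", name="dateTime")
df_steps1h = pd.DataFrame({"steps": range(len(idx))}, index=idx)

# Hypothetical frame of invalid days, indexed by date only
df_invalid = pd.DataFrame(index=pd.DatetimeIndex(["2020-01-02"], name="dateTime"))

# normalize() sets every timestamp to midnight, giving a day-level key
is_invalid = df_steps1h.index.normalize().isin(df_invalid.index)

# Keep the hourly rows whose day is not listed as invalid
df_clean = df_steps1h[~is_invalid]
```

The remaining rows keep their original hourly timestamps, so the result still contains one row per hour for every valid day.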

edited Aug 1, 2020 at 17:56

answered Aug 1, 2020 at 17:52


6 Comments

MyTivoli Over a year ago

Hello! Thanks for answering, I tried running your code, but it doesn't remove any rows :/

Anant Kumar Over a year ago

Please try after the edit. I've changed this line df_valid=df_valid[~df_valid.actual_index.isin(df.index)]

MyTivoli Over a year ago

Now I get: A value is trying to be set on a copy of a slice from a DataFrame. See the caveats in the documentation: pandas.pydata.org/pandas-docs/stable/user_guide/… errors=errors,

MyTivoli Over a year ago

But I do also still need the dataframe to contain info on every hour

MyTivoli Over a year ago

Hey! I changed df_valid=df_valid[~df_valid.index.isin(df.index)] and it works
