I have a dataframe which includes some "invalid" rows, which I would like to remove. I have a second dataframe which contains these invalid rows.
The index of the invalid rows is:
DatetimeIndex(['2019-11-11', '2019-12-06', '2019-12-13', '2019-12-15',
'2019-12-17', '2019-12-18', '2019-12-19', '2019-12-31',
'2020-01-01', '2020-01-02', '2020-01-03', '2020-01-10',
'2020-01-15', '2020-01-17', '2020-01-22', '2020-02-05',
'2020-02-07', '2020-02-09', '2020-02-10', '2020-02-12',
'2020-02-14', '2020-02-19', '2020-02-20', '2020-02-21',
'2020-02-25', '2020-02-26', '2020-02-28', '2020-03-02',
'2020-03-04', '2020-03-06', '2020-03-11', '2020-03-12',
'2020-03-15', '2020-03-22', '2020-03-29', '2020-04-04',
'2020-04-11', '2020-04-13', '2020-05-13', '2020-05-23',
'2020-05-29', '2020-05-30', '2020-06-12', '2020-06-15',
'2020-06-19', '2020-06-24', '2020-06-26', '2020-07-09',
'2020-07-10', '2020-07-11', '2020-07-12', '2020-07-16',
'2020-07-17', '2020-07-18', '2020-07-20', '2020-07-23',
'2020-07-24', '2020-07-26'],
dtype='datetime64[ns]', name='dateTime', freq=None)
I want to remove these rows (dates) from:
DatetimeIndex(['2019-11-11 11:00:00', '2019-11-11 12:00:00',
'2019-11-11 13:00:00', '2019-11-11 14:00:00',
'2019-11-11 15:00:00', '2019-11-11 16:00:00',
'2019-11-11 17:00:00', '2019-11-11 18:00:00',
'2019-11-11 19:00:00', '2019-11-11 20:00:00',
...
'2020-07-26 05:00:00', '2020-07-26 06:00:00',
'2020-07-26 07:00:00', '2020-07-26 08:00:00',
'2020-07-26 09:00:00', '2020-07-26 10:00:00',
'2020-07-26 11:00:00', '2020-07-26 12:00:00',
'2020-07-26 13:00:00', '2020-07-26 14:00:00'],
dtype='datetime64[ns]', name='dateTime', length=6196, freq='H')
I tried:
df_steps1h.loc[df_steps1h.index.difference(df_valid.index), ]
and
df_steps1h[~df_steps1h.index.isin(df_valid.index)].dropna()
The DataFrames are different, so I don't want to use concat or merge. However, neither attempt removes anything. Any ideas as to why? Thanks!
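One possible cause, sketched under assumptions: the invalid index holds one timestamp per day (implicitly at midnight), while the index to filter is hourly. `isin` and `difference` compare full timestamps, so at most the 00:00:00 rows can match, and the rest of each day stays. The small frames and the name `invalid_dates` below are hypothetical stand-ins for the real data; comparing at date level with `DatetimeIndex.normalize()` is one way that might give the intended result, not necessarily the only one.

```python
import pandas as pd

# Hypothetical stand-in for df_steps1h: hourly rows spanning three days.
idx = pd.date_range("2019-12-05 00:00", "2019-12-07 23:00",
                    freq=pd.Timedelta(hours=1), name="dateTime")
df_steps1h = pd.DataFrame({"steps": range(len(idx))}, index=idx)

# Hypothetical stand-in for the index of the second frame:
# one entry per invalid day, stored as a midnight timestamp.
invalid_dates = pd.DatetimeIndex(["2019-12-06"], name="dateTime")

# Exact timestamp comparison: only the 2019-12-06 00:00:00 row matches.
exact_mask = df_steps1h.index.isin(invalid_dates)

# Date-level comparison: normalize() resets every timestamp to midnight,
# so all 24 hourly rows of 2019-12-06 match.
date_mask = df_steps1h.index.normalize().isin(invalid_dates)

# Keep only the rows whose date is not in the invalid set.
df_clean = df_steps1h[~date_mask]
```

With the real data, the same mask would presumably be built from the second dataframe's index, normalizing it as well if it could contain non-midnight times.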