1

I am currently writing a script where I want to drop some rows of my pandas dataframe according to Datetime values over several years (I want to drop rows where the datetime is between February and May). So, I first tried the following code:

game_df['Date'] = game_df[(game_df['Date'].dt.month < 2) & (game_df['Date'].dt.month > 5)]

It gave me the same dataframe with NaN values in the 'Date' column over this period of time. So I tried the following code in order to drop the corresponding rows:

game_df['Date'] = game_df[(game_df['Date'].dt.month < 2) & (game_df['Date'].dt.month > 5)].drop(game_df.columns)

But it raised an error like: labels [u'Date' u'other_column1' u'other_column2' u'other_column3' u'other_column4'] not contained in axis

Can anyone solve this problem?

3 Answers

4

I think you could try something like this using a list of Timestamps:

If you want to exclude rows with specific dates:

game_df[~game_df['Date'].isin([pd.Timestamp('20150210'), pd.Timestamp('20150301')])]

The ~ in front of the condition is the element-wise NOT operator, in case you're not familiar with it. So it's saying to return the dataframe where the timestamps are not the two dates mentioned.

Edit: If you want to exclude a range of rows between specific dates:

game_df[~game_df['Date'].isin(pd.date_range(start='20150210', end='20150301'))]
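
One caveat with the range version: isin compares exact timestamps, and pd.date_range produces midnight values by default, so rows that carry a time of day would not be matched. Here is a minimal sketch of the difference, assuming a toy game_df (the sample data and the score column are made up for illustration) and using Series.between as one possible alternative:

```python
import pandas as pd

# Hypothetical sample data; the real game_df is not shown in the question.
game_df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2015-02-09 18:00", "2015-02-10 00:00", "2015-02-20 13:30",
        "2015-03-01 00:00", "2015-03-02 09:00",
    ]),
    "score": [1, 2, 3, 4, 5],
})

start, end = pd.Timestamp("2015-02-10"), pd.Timestamp("2015-03-01")

# isin only matches the exact (midnight) timestamps produced by date_range
via_isin = game_df[~game_df["Date"].isin(pd.date_range(start=start, end=end))]

# between is inclusive on both ends by default and also catches intraday times
via_between = game_df[~game_df["Date"].between(start, end)]
```

Note that end here is midnight on March 1, so with between a row stamped later that day would still be kept; pick the end bound accordingly if the dates carry times.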

5 Comments

Thanks for your help. Does it return the dataframe without those two dates, or without all the dates between them (inclusive)? The dates I gave should actually be treated as a time range, and I want to exclude observations whose 'Date' values fall within that range (February 10 to March 1).
Do you want to exclude specific days or a range of days?
Yes I want to exclude a range of days
The code in your example shows you trying to return a game_df where the month is not February OR day is less than 10. I was confused. Depending on your date range, that could exclude days you want that are not in February. I updated my answer.
I made some mistakes when describing my problem, so I edited my original post to make it clearer. I want to drop from my dataframe the rows whose datetime values are between February and May (a range of time).
1

Actually, I've found what I was looking for with the following code:

game_df = game_df[(game_df['Date'].dt.month != 2) & (game_df['Date'].dt.month != 3) & (game_df['Date'].dt.month != 4)\
                      & (game_df['Date'].dt.month != 5)]

It is pretty ugly and I suspect it can be done more efficiently, but it works for excluding rows whose datetime values fall within a span of time.
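
A more compact way to write the same filter is to test the month against a list with isin and negate the result. This is only a sketch on made-up sample data (the dates below are hypothetical, not taken from the question):

```python
import pandas as pd

# Hypothetical sample data spanning several years.
game_df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2015-01-15", "2015-02-10", "2015-05-31",
        "2015-06-01", "2016-03-20", "2016-12-25",
    ]),
})

# Months to drop: February through May
months_to_drop = [2, 3, 4, 5]

# Keep only rows whose month is NOT in the list
game_df = game_df[~game_df["Date"].dt.month.isin(months_to_drop)]
```

Since the months are contiguous, ~game_df["Date"].dt.month.between(2, 5) should give the same mask; the list form is easier to adapt if the months to drop are not consecutive.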

1 Comment

It seems like your initial error was because you used &. The same month cannot be smaller than 2 and larger than 5 at the same time. Instead of & try | operator (it is element-wise OR).
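
To illustrate the point about & versus |, here is a small sketch on a plain Series of month numbers (the months variable is just an illustration, not part of the original code):

```python
import pandas as pd

months = pd.Series(range(1, 13))

# AND: no month is both below 2 and above 5, so the mask is all False
both = (months < 2) & (months > 5)

# OR: True for January and for June through December
either = (months < 2) | (months > 5)

kept_months = months[either].tolist()
```

With the | mask, the filtered result would also need to be assigned back to game_df itself rather than to game_df['Date'], as the accepted code above does.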
0

Instead of dropping, I find query much more helpful. Of course, the condition has to describe the part of the data you want to keep, and the two halves must be joined with or, since a month cannot be both below 2 and above 5:

df.query("Date.dt.month < 2 or Date.dt.month > 5", inplace=True)

If you want to use exact dates (here keeping everything outside February to May 2017):

df.query("Date < '2017-02-01' or Date >= '2017-06-01'", inplace=True)
