How to remove rows from pandas dataframe with an initial date condition

Question

I have a pandas dataframe, one of the columns of which contains dates.

My objective is to set an initial date and discard all rows of the dataframe dated before it. Snippet of the dataframe:

 ID         fecha         
519457    25/02/2020 10:03    
519462    25/02/2020 10:07     
519468    25/02/2020 10:12
 ...           ...

The code I have been trying to use is the following:

xls=pd.ExcelFile(r'/home/.../Final.xlsx')
xls.sheet_names
df=pd.read_excel(xls,"Hoja1")
Date_initial=['25/02/2020 10:07:00']
df=df.drop(df[["fecha"]<Date_initial].index)

This did not work. I also tried substituting the last line with:

df[(df['fecha']>=Date_initial)]

As a result, I obtained the error:

ValueError: Lengths must match to compare

Am I missing something in the expression, or am I going about this in completely the wrong way? Thanks for your input!

Quang Hoang · Accepted Answer · 2020-06-30 15:27:43Z

Maybe something like this:

Date_initial='25/02/2020 10:07:00'
df=df[df["fecha"]>=Date_initial]
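
The difference from the question's attempt is that the cutoff is a single value rather than a one-element list. Below is a minimal sketch of that difference. It assumes the `fecha` column already holds datetimes, and the in-memory dataframe and the names `cutoff`, `mask` and `kept_ids` are made up for illustration; the Excel file is not used.

```python
import pandas as pd

# Hypothetical stand-in for the Excel sheet, with the dates already parsed.
df = pd.DataFrame({
    "ID": [519457, 519462, 519468],
    "fecha": pd.to_datetime(
        ["25/02/2020 10:03", "25/02/2020 10:07", "25/02/2020 10:12"],
        format="%d/%m/%Y %H:%M",
    ),
})
cutoff = pd.to_datetime("25/02/2020 10:07:00", format="%d/%m/%Y %H:%M:%S")

# A list is compared element by element, so a 3-row column against a
# 1-element list raises ValueError because the lengths differ.
try:
    df["fecha"] >= [cutoff]
    list_comparison_failed = False
except ValueError:
    list_comparison_failed = True

# A scalar is broadcast against every row and yields a boolean mask.
mask = df["fecha"] >= cutoff
kept_ids = df.loc[mask, "ID"].tolist()
```

Here `kept_ids` holds the IDs of the rows at or after the cutoff, and `df[mask]` is the filtered dataframe.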

Also, I recommend using the datetime type, so that the comparison is chronological rather than a string comparison:

df = pd.read_excel(xls, 'Hoja1', parse_dates=['fecha'], dayfirst=True)

Date_initial = pd.to_datetime('25/02/2020 10:07:00')
df = df[df['fecha'] >= Date_initial]
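
Why the datetime type matters can be sketched with a few made-up rows. This assumes, for illustration only, that the column was read as text in day/month/year order; the dataframe `raw` and the variable names are hypothetical.

```python
import pandas as pd

# Made-up rows; the last one falls in March, after the cutoff.
raw = pd.DataFrame({
    "ID": [1, 2, 3],
    "fecha": ["25/02/2020 10:03", "25/02/2020 10:12", "01/03/2020 09:00"],
})
cutoff_text = "25/02/2020 10:07:00"

# Plain string comparison is lexicographic, so "01/03/..." sorts
# before "25/02/..." and the March row is wrongly dropped.
ids_by_string = raw.loc[raw["fecha"] >= cutoff_text, "ID"].tolist()

# Parse both sides with an explicit day-first format, then compare
# chronologically.
parsed = raw.assign(
    fecha=pd.to_datetime(raw["fecha"], format="%d/%m/%Y %H:%M")
)
cutoff = pd.to_datetime(cutoff_text, format="%d/%m/%Y %H:%M:%S")
ids_by_datetime = parsed.loc[parsed["fecha"] >= cutoff, "ID"].tolist()
```

With the text column only ID 2 survives the filter, while the parsed column keeps IDs 2 and 3, which is the chronologically correct result.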

1 Comment

enricw Over a year ago

This did the trick! Thanks a lot. PS. I removed an extra bracket from your answer.
