Python Pandas Dataframe Datetime Range

Question

Here is my code block:

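import datetime as dt
import pandas as pd

# Assumed: todays_year and todays_month come from today's date.
today = dt.date.today()
todays_year, todays_month = today.year, today.month
first_day = dt.date(todays_year, todays_month, 1)  # first day of the current month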

print(first_day)
>2021-02-01

print(type(first_day))
><class 'datetime.date'>

My code runs successfully as below:

df = pd.read_excel('AllServiceActivities.xlsx',
                   sheet_name='All Service Activities',
                   usecols=[7, 12, 13]).query(f'Resources.str.contains("{name} {surname}")',
                                              engine='python')

I would also like to do something like this ("Scheduled Start" is my column name):

df = pd.read_excel('AllServiceActivities.xlsx',
                   sheet_name='All Service Activities',
                   usecols=[7, 12, 13]).query(f'Scheduled Start >= {first_day})',
                                              engine='python')

As you can guess, it does not work.

There are solutions such as "Select DataFrame rows between two dates", but I want to use the query method because I don't want to pass along all of the irrelevant data.

Edit (to generate test data):

dtr = [dt.datetime(2021,1,27,12,0),
dt.datetime(2021,2,3,10,0),
dt.datetime(2021,1,25,9,0),
dt.datetime(2021,1,15,7,59),
dt.datetime(2021,1,13,10,59),
dt.datetime(2021,1,12,13,59),
dt.datetime(2021,1,11,13,59),
dt.datetime(2021,2,2,9,29),
dt.datetime(2021,1,20,7,59),
dt.datetime(2021,1,19,10,59),
dt.datetime(2021,2,1,10,0),
dt.datetime(2021,1,19,7,59),
dt.datetime(2021,1,29,7,59),
dt.datetime(2021,1,28,13,0),
dt.datetime(2021,1,28,10,59),
dt.datetime(2021,1,27,19,30),
dt.datetime(2021,1,27,13,30),
dt.datetime(2021,1,18,17,30),
dt.datetime(2021,1,19,9,0),
dt.datetime(2021,1,18,13,0),
dt.datetime(2021,2,1,14,19),
dt.datetime(2021,1,29,14,30),
dt.datetime(2021,1,14,13,0),
dt.datetime(2021,1,8,13,0),
dt.datetime(2021,1,26,10,59),
dt.datetime(2021,1,25,10,0),
dt.datetime(2021,1,23,16,0),
dt.datetime(2021,1,21,10,0),
dt.datetime(2021,1,18,10,59),
dt.datetime(2021,1,11,13,30),
dt.datetime(2021,1,20,22,0),
dt.datetime(2021,1,20,21,0),
dt.datetime(2021,1,22,19,59),
dt.datetime(2021,1,12,13,59),
dt.datetime(2021,1,21,13,59),
dt.datetime(2021,1,20,10,30),
dt.datetime(2021,1,19,16,59),
dt.datetime(2021,1,19,10,0),
dt.datetime(2021,1,14,9,29),
dt.datetime(2021,1,19,8,53),
dt.datetime(2021,1,18,10,59),
dt.datetime(2021,1,13,16,0),
dt.datetime(2021,1,13,15,0),
dt.datetime(2021,1,12,13,59),
dt.datetime(2021,1,11,10,0),
dt.datetime(2021,1,8,9,0),
dt.datetime(2021,1,7,13,0),
dt.datetime(2021,1,6,13,59),
dt.datetime(2021,1,5,12,0),
dt.datetime(2021,1,10,0,0),
dt.datetime(2020,12,8,13,0),
dt.datetime(2021,1,7,11,10),
dt.datetime(2021,1,6,8,12),
dt.datetime(2021,1,5,10,0),
dt.datetime(2021,1,5,15,15),
dt.datetime(2021,1,4,7,59)]

df1 = pd.DataFrame(dtr, columns=['Scheduled Start'])
df2 = df1.query("'Scheduled Start' >= @first_day")

Thanks!

mullinscr · Accepted Answer · 2021-02-04 21:56:11Z

1

Without a reproducible example it's hard to know for sure, but try this. It uses the @ character to reference variables.

df = pd.read_excel(
    'AllServiceActivities.xlsx',
    sheet_name='All Service Activities',
    usecols=[7, 12, 13]) \
      .query('`Scheduled Start` >= @first_day')
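
For reference, here is a minimal sketch of how query treats a column name that contains a space together with an @ variable. The sample rows are made up; only the column name comes from the question. Wrapping first_day in pd.Timestamp is an assumption added here, because depending on the pandas version, comparing a datetime64 column directly with a datetime.date may raise a TypeError.

```python
import datetime as dt

import pandas as pd

# Made-up sample data; only the column name comes from the question.
df1 = pd.DataFrame({
    "Scheduled Start": pd.to_datetime([
        "2021-01-27 12:00",
        "2021-02-03 10:00",
        "2021-02-01 10:00",
        "2021-01-15 07:59",
    ])
})

# Convert the date to a Timestamp so both sides of the comparison are datetimes.
first_day = pd.Timestamp(dt.date(2021, 2, 1))

# Backticks mark a column name containing a space; @ refers to a local variable.
# Plain quotes around the name would create a string literal instead of a column reference.
df2 = df1.query("`Scheduled Start` >= @first_day", engine="python")
print(df2)
```

Only the rows on or after the first day of the month should remain in df2.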

4 Comments

Umut TEKİN Over a year ago

It didn't work. I added some of my data so you can generate an example. Thanks for the help :).

Umut TEKİN Over a year ago

When I tried this:
> df2 = df1.query("'Scheduled Start' >= @first_day", engine='python')
I got this:
> TypeError: '>=' not supported between instances of 'str' and 'datetime.date'
So then I tried this:
> df2 = df1.query("datetime.strptime('Scheduled Start', '%Y-%m-%d %H:%M') >= @first_day", engine='python')
and got this:
> ValueError: time data 'Scheduled Start' does not match format '%Y-%m-%d %H:%M'
Pandas evaluates the datetime data type as a string. I cannot change it because it is pipelined via Excel.

Umut TEKİN Over a year ago

I also tried this:
> df2 = df1.query("datetime.strptime('Scheduled Start', '%Y-%m-%d %H:%M:%S') >= @first_day", engine='python')
which gave the same error. To check my string format I used this:
> for i in dtr: print(str(i)); print(dt.datetime.strptime(str(i), '%Y-%m-%d %H:%M:%S'))
and it worked fine.

mullinscr Over a year ago

In the read_excel() function you need to tell pandas to import your 'Scheduled Start' column as a datetime, with something like parse_dates=[x], where x is the column number holding the date (7, 12, or 13 in your example above). See the docs. You're getting the error because pandas doesn't yet know that your Scheduled Start column contains dates, so it can't compare it to other dates.
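
A small sketch of the parse_dates idea from the last comment. It uses read_csv on an in-memory string instead of read_excel so that it runs without the workbook; both functions accept a parse_dates argument. The names and rows are invented for illustration.

```python
import io

import pandas as pd

# Invented stand-in for the Excel sheet, kept in memory so the sketch is self-contained.
csv_text = """Resources,Scheduled Start
Jane Doe,2021-01-27 12:00:00
Jane Doe,2021-02-03 10:00:00
John Roe,2021-02-01 14:19:00
"""

# Without parse_dates the column is read as plain strings.
raw = pd.read_csv(io.StringIO(csv_text))
print(raw["Scheduled Start"].dtype)

# With parse_dates the same column becomes datetime64, so date comparisons work.
parsed = pd.read_csv(io.StringIO(csv_text), parse_dates=["Scheduled Start"])
print(parsed["Scheduled Start"].dtype)

first_day = pd.Timestamp(2021, 2, 1)
print(parsed[parsed["Scheduled Start"] >= first_day])
```

If the column is already loaded as text, pd.to_datetime on that column is another way to get the same dtype.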

Umut TEKİN · Accepted Answer · 2021-02-06 17:52:18Z

Firstly, thanks for guiding me, @mullinscr.

I got extra information about date_parser and parse_dates from here:

https://www.programcreek.com/python/example/101346/pandas.read_excel

date_parser is a parser function specific to my case.

def date_parser(x):
    # Drop fractional seconds; treat Excel's 1899 placeholder dates as missing.
    s = str(x)
    if "." in s:
        return dt.datetime.strptime(s.split(".")[0], "%Y-%m-%d %H:%M:%S")
    if "1899" in s:
        return None
    return dt.datetime.strptime(s, "%Y-%m-%d %H:%M:%S")


df = pd.read_excel('AllServiceActivities.xlsx',
                   sheet_name='All Service Activities',
                   header=None,
                   names=["Resources", "Start", "End"],
                   skiprows=1,
                   usecols=[7, 12, 13],
                   parse_dates=[1],
                   date_parser=date_parser).query(
    "Start >= @first_day and End <= @last_day and Resources.str.contains('{} {}')".format(name, surname),
    engine='python')

Hope this helps everyone :).
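
The query in this answer relies on first_day and last_day, but last_day is never defined in the thread. Below is a sketch of one way to build the month bounds and apply the same kind of combined filter. It assumes the range is a single calendar month and swaps the inclusive last_day for an exclusive "start of next month" bound, which is a substitution made here and not the author's code. The rows and the person's name are made up.

```python
import pandas as pd

# Made-up rows using the column names from the answer above.
df = pd.DataFrame({
    "Resources": ["Jane Doe; John Roe", "John Roe", "Jane Doe", "Jane Doe"],
    "Start": pd.to_datetime(["2021-02-02 09:00", "2021-02-03 10:00",
                             "2021-01-29 08:00", "2021-02-27 13:00"]),
    "End": pd.to_datetime(["2021-02-02 11:00", "2021-02-03 12:00",
                           "2021-01-29 09:00", "2021-03-01 10:00"]),
})

# Assumption: a fixed "today" keeps the sketch reproducible.
today = pd.Timestamp(2021, 2, 4)
first_day = today.replace(day=1).normalize()
# Start of the following month, used as an exclusive upper bound.
next_month = first_day + pd.offsets.MonthBegin(1)

name, surname = "Jane", "Doe"  # hypothetical person
full_name = f"{name} {surname}"

# Same shape as the query in the answer: two date bounds plus a name match.
result = df.query(
    f"Start >= @first_day and End < @next_month and Resources.str.contains('{full_name}')",
    engine="python",
)
print(result)
```

An exclusive upper bound avoids having to decide whether last_day means midnight or the end of that day.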
