I can easily convert a string column to a datetime in pandas, as shown here:
df.date = pd.to_datetime(df.date, format="%m/%d/%Y")
There seems to be no equally easy way to do this in dask.
Here is the pandas example that works with dates:
import pandas as pd
url="http://web.mta.info/developers/data/nyct/turnstile/turnstile_170128.txt"
df=pd.read_csv(url)
df.info()
df.columns=['ca', 'unit', 'scp', 'station', 'linename', 'division', 'date', 'time', 'desc', 'entries', 'exits']
df.date = pd.to_datetime(df.date, format="%m/%d/%Y")
And here is the dask version, which loads the data but where I cannot convert the string column:
link = 'http://web.mta.info/developers/'
data = ['data/nyct/turnstile/turnstile_170128.txt',
        'data/nyct/turnstile/turnstile_170121.txt',
        'data/nyct/turnstile/turnstile_170114.txt',
        'data/nyct/turnstile/turnstile_170107.txt'
        ]
urls=[]
for i in data:
    urls.append(link+i)
import pandas as pd
import dask
import dask.dataframe as dd
ddfs = [dask.delayed(pd.read_csv)(url) for url in urls]
ddf = dd.from_delayed(ddfs)
ddf.columns=['ca', 'unit', 'scp', 'station', 'linename', 'division', 'date', 'time', 'desc', 'entries', 'exits']
How do I convert the string column to a date in dask?