I have a pandas DataFrame with a column for years and one for months. How can I create a new date column based on these two? (I can assume day = 15.)

I tried the following:

import pandas as pd
import numpy as np
import datetime  

df = pd.DataFrame()
df['year'] = np.arange(2000,2010)

df['mydate']= datetime.date( df['year'].apply(lambda x: int(x)) , 1 , 1)

but I get this error message:

df['mydate']= datetime.date( df['year'].apply(lambda x: int(x)) , 1 , 1)
File "C:\Anaconda\lib\site-packages\pandas\core\series.py", line 77, in wrapper
"cannot convert the series to {0}".format(str(converter)))
TypeError: cannot convert the series to

which I don't understand, because I explicitly convert x to int.

Thanks!

1 Answer

You can build another column based on the existing columns by using df.apply(fnc, axis=1).

In your case this becomes:

df = pd.DataFrame()
df['year'] = np.arange(2000,2010)
df['month'] = 6

df['date_time']= df.apply(lambda row :
                          datetime.date(row.year,row.month,15), 
                          axis=1)
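
As a possible alternative to the row-wise apply, here is a minimal sketch that lets pandas assemble the dates in one vectorized call. It assumes the columns are named year and month as in the example above; the helper frame `parts` and its added `day` column are introduced here only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'year': np.arange(2000, 2010), 'month': 6})

# pd.to_datetime can assemble dates from a frame whose columns are
# named year, month and day; the day column is added here as a constant.
parts = df[['year', 'month']].assign(day=15)
df['date_time'] = pd.to_datetime(parts)
```

Note that this yields pandas Timestamp values (a datetime64 column) rather than datetime.date objects. If plain date objects are needed, df['date_time'].dt.date should convert them.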

2 Comments

Thanks! Was the error in my code because datetime.date works on a single row, so with functions that work on one row at a time I have to do apply(fun()) rather than fun(apply())?
You were supplying a Series as an argument to datetime.date(), which doesn't make sense. Since date() doesn't know what to do with it, it apparently tries to convert it to an int. In my example you call datetime.date on three integers: the lambda receives a row (a Series) as input and outputs a date object, and df.apply then collects those objects into a Series, which is returned.
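
The difference described in these comments can be sketched as follows. This is only an illustration: the Series `years` and the variables `raised` and `dates` are made up for the example, and the exact wording of the error message may vary between pandas and Python versions.

```python
import datetime
import pandas as pd

years = pd.Series([2000, 2001, 2002])

# Passing the whole Series: datetime.date expects one integer for the year,
# so handing it a Series raises a TypeError.
try:
    datetime.date(years, 1, 1)
    raised = False
except TypeError:
    raised = True

# Calling the constructor once per element works, because each call
# receives a single integer.
dates = years.apply(lambda y: datetime.date(int(y), 1, 1))
```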
