
I'm trying to create a new date column based on an existing date column in my dataframe. I want to take all the dates in the first column and make them the first of the month in the second column so:

03/15/2019 = 03/01/2019

I know I can do this using:

df['newcolumn'] = pd.to_datetime(df['oldcolumn'], format='%Y-%m-%d').apply(lambda dt: dt.replace(day=1)).dt.date

My issue is that some of the data in the old column are not valid dates; some rows contain text. So I'm trying to figure out how to either clean up the data before I do this, like:

if oldcolumn isn't a date then make it 01/01/1990 else oldcolumn

Or, is there a way to do this with try/except?

Any assistance would be appreciated.

2 Answers

2

First, generate some sample data:

df = pd.DataFrame([['2019-01-03'], ['asdf'], ['2019-11-10']], columns=['Date'])

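This can be safely converted to datetime. With errors='coerce', values that cannot be parsed become NaT, and those rows are then set to the placeholder date: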

df['Date'] = pd.to_datetime(df['Date'], errors='coerce')
mask = df['Date'].isnull()
df.loc[mask, 'Date'] = pd.Timestamp('1990-01-01')

Now you don't need the slow apply

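# Truncate each date to the first of its own month.
# (Adding pd.offsets.MonthBegin(-1) would move dates already on the 1st,
# such as the 1990-01-01 placeholder, back to the previous month.)
df['New'] = df['Date'].dt.to_period('M').dt.to_timestamp()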
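
One detail worth checking with offset arithmetic is what happens to dates that already fall on the first of the month. The sketch below is only an illustration on made-up sample dates (the names `dates`, `with_offset`, and `month_start` are mine, not from the question); it compares adding `pd.offsets.MonthBegin(-1)` with a round-trip through monthly periods.

```python
import pandas as pd

# Hypothetical sample: one mid-month date and two dates already on the 1st.
dates = pd.Series(pd.to_datetime(['2019-03-15', '2019-03-01', '1990-01-01']))

# Offset arithmetic: a mid-month date rolls back to the 1st of its month,
# but a date already on the 1st rolls back a whole month.
with_offset = dates + pd.offsets.MonthBegin(-1)

# Period round-trip: every date is truncated to the 1st of its own month.
month_start = dates.dt.to_period('M').dt.to_timestamp()

print(pd.DataFrame({'date': dates,
                    'with_offset': with_offset,
                    'month_start': month_start}))
```

If the column can contain first-of-month dates (including the 1990-01-01 placeholder), the period round-trip is probably the safer choice.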

1 Comment

This is a more elegant solution, to be sure, but I wouldn't call apply "slow", necessarily. See e.g.: geeksforgeeks.org/…
1

Try the argument errors='coerce'. This returns NaT for the text values instead of raising an error.

df['newcolumn'] = pd.to_datetime(df['oldcolumn'], 
                                 format='%Y-%m-%d', 
                                 errors='coerce').apply(lambda dt: dt.replace(day=1)).dt.date

For example

# We have this dataframe
    ID        Date
0  111  03/15/2019
1  133  01/01/2019
2  948       Empty
3  452  02/10/2019

# We convert Date column to datetime
df['Date'] = pd.to_datetime(df.Date, format='%m/%d/%Y', errors='coerce')

Output

    ID       Date
0  111 2019-03-15
1  133 2019-01-01
2  948        NaT
3  452 2019-02-10
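
To get from that NaT output to what the question asks for (a placeholder of 01/01/1990 for bad rows, then the first of the month), something like the following sketch might work. It assumes the `MM/DD/YYYY` strings from the example above; the `first_of_month` helper and the `newcolumn_try` column are hypothetical names added only to illustrate the try/except route the question mentions.

```python
from datetime import date, datetime

import pandas as pd

df = pd.DataFrame({'ID': [111, 133, 948, 452],
                   'Date': ['03/15/2019', '01/01/2019', 'Empty', '02/10/2019']})

# Vectorized route: coerce bad values to NaT, fill them with the placeholder,
# then truncate every date to the first day of its month.
parsed = pd.to_datetime(df['Date'], format='%m/%d/%Y', errors='coerce')
parsed = parsed.fillna(pd.Timestamp('1990-01-01'))
df['newcolumn'] = parsed.dt.to_period('M').dt.to_timestamp().dt.date


# try/except route (hypothetical helper): parse one value at a time and
# fall back to the placeholder when parsing fails.
def first_of_month(value, fmt='%m/%d/%Y', default=date(1990, 1, 1)):
    try:
        return datetime.strptime(value, fmt).date().replace(day=1)
    except (TypeError, ValueError):
        # TypeError: value is not a string; ValueError: string does not match fmt.
        return default


df['newcolumn_try'] = df['Date'].apply(first_of_month)

print(df)
```

Both columns should hold the same `datetime.date` values; the vectorized version avoids a Python-level call per row, while the helper makes the fallback logic explicit.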

3 Comments

That looks like a great solution, but I get the following error when trying to implement it: TypeError: to_datetime() got an unexpected keyword argument 'error'
You misspelled the argument: you forgot the "s". It's errors, not error.
ah...thank you for clarifying and for your help in general. much appreciated.
