I have a 'Posting Date' column in a DataFrame with values formatted like '2017-03-01'. Its dtype is datetime64[ns]. I want to change any value after '2017-03-31' to '2017-03-31' and leave all other values unchanged.

When I type df['Posting Date'] > '2017-03-31', it correctly shows all the rows where the condition is met, so I assume the date comparison works.

However, when I use numpy.where to write the condition like this:

df['Posting Date'] = np.where(df['Posting Date']>'2017-03-31','2017-03-31,'df['Posting Date'])

it raises an invalid type promotion error.

I also tried df.loc, and an error occurs there too:

df.loc[df['Posting Date']>'2017-03-31','Posting Date']='2017-03-31'

ValueError: invalid literal for int() with base 10: '2017-03-31'

I'm wondering why the error occurs. How can I replace the dates correctly? Any method that works is fine.

2
  • I think you've got a cut-and-paste error. '2017-03-31,'df['Posting Date'] is a syntax error. (Presumably the comma should be outside the quotes.) If this is actually correct numpy syntax, my apologies. Commented Oct 15, 2017 at 2:19
  • I haven't tried it, but you could try df['Posting Date'].clip(upper=pd.Timestamp('2017-03-31')). Commented Oct 15, 2017 at 2:35

1 Answer

This happens because you are trying to put a string into a column of datetime dtype. Pass a datetime to np.where instead, i.e.

df['Posting Date'] = np.where(df['Posting Date']>'2017-03-31',pd.to_datetime(['2017-03-31']),df['Posting Date'])

Example:

df = pd.DataFrame({'Posting Date': pd.to_datetime(['20-4-2017','20-4-2017','20-4-2017','20-3-2017','20-2-2017'], dayfirst=True)})
df['Posting Date'] = np.where(df['Posting Date']>'2017-03-31',pd.to_datetime(['2017-03-31']),df['Posting Date'])

Output:

Posting Date
0   2017-03-31
1   2017-03-31
2   2017-03-31
3   2017-03-20
4   2017-02-20
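
The df.loc attempt from the question can be made to work along the same lines by assigning a datetime value rather than a string. The following is only a minimal sketch under that assumption; the sample data and the name cutoff are made up for illustration and are not from the original post.

```python
import pandas as pd

# Sample data, invented for this sketch.
df = pd.DataFrame({
    "Posting Date": pd.to_datetime(
        ["2017-04-20", "2017-04-20", "2017-03-20", "2017-02-20"]
    )
})

# Use a real datetime value as the replacement, not the string '2017-03-31'.
cutoff = pd.Timestamp("2017-03-31")

# Overwrite only the rows whose date falls after the cutoff.
df.loc[df["Posting Date"] > cutoff, "Posting Date"] = cutoff

print(df)
```

Because both the comparison and the assigned value are datetimes, the column should keep its datetime dtype.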

A better option, posted by @pirSquared in the comments, uses clip, i.e.

df['Posting Date'] = df['Posting Date'].clip(upper=pd.Timestamp('2017-03-31')) 
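
For anyone who wants to try the clip approach in isolation, here is a small self-contained sketch. The sample dates and the variable names cutoff and clipped are assumptions for illustration only.

```python
import pandas as pd

# Sample data, invented for this sketch.
df = pd.DataFrame({
    "Posting Date": pd.to_datetime(["2017-04-20", "2017-03-20", "2017-02-20"])
})

cutoff = pd.Timestamp("2017-03-31")

# clip(upper=...) caps every value above the cutoff and leaves the rest alone.
clipped = df["Posting Date"].clip(upper=cutoff)

# Sanity check: nothing in the result is later than the cutoff.
print((clipped <= cutoff).all())
print(clipped)
```

clip stays entirely inside pandas, so no conversion between strings, NumPy arrays and datetimes is involved.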

2 Comments

Glad to help @LavenderPan
For anyone who lands here with the same problem I had: I was using np.where almost exactly as Dark suggested, but all the values that were not supposed to be replaced were being set to integers instead of staying datetimes. What I had missed was the [ ] wrapping the date string inside pd.to_datetime(). Why that matters, I don't know.
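
A likely explanation for the integers, which nobody in this thread confirms: without the [ ], pd.to_datetime returns a single Timestamp, which NumPy treats as a generic Python object, so np.where builds an object array and the datetime64[ns] values appear in it as integer nanosecond counts. One way to avoid the question altogether is to stay inside pandas with Series.mask. The sketch below assumes made-up sample data and the hypothetical names dates, cutoff and capped.

```python
import pandas as pd

# Sample data, invented for this sketch.
dates = pd.Series(
    pd.to_datetime(["2017-04-20", "2017-03-20", "2017-02-20"]),
    name="Posting Date",
)

cutoff = pd.Timestamp("2017-03-31")

# mask() replaces the values where the condition is True and keeps the rest;
# a scalar Timestamp is fine here, no list wrapping needed.
capped = dates.mask(dates > cutoff, cutoff)

print(capped)
```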