I have a data frame df with shape (500000, 70) and several columns that contain invalid dates such as 4000-01-01 00:00:00. On a smaller version of this data frame I tried
df["date"] = df["date"].astype(str)
df["date"] = df["date"].replace('4000-01-01 00:00:00', pd.NaT)
which worked fine. Also the version
df["date"] = pd.to_datetime(df["date"].replace("4000-01-01 00:00:00", pd.NaT))
worked. On the full data frame, however, I get the following error:
OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 4000-01-01 00:00:00
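A minimal sketch of the situation, assuming the column holds strings and the full data contains a second out-of-range value that the exact-match replace does not catch. The sample values and the names s, replaced and coerced are made up for illustration. It also tries errors="coerce", which on pandas versions with nanosecond datetime resolution should turn values past pd.Timestamp.max into NaT instead of raising:

```python
import pandas as pd

# Hypothetical sample data; the real column has 500000 rows.
s = pd.Series([
    "2020-05-17 00:00:00",
    "4000-01-01 00:00:00",
    "4000-01-02 00:00:00",  # assumed second out-of-range value
])

# Upper limit of a nanosecond-resolution timestamp.
print(pd.Timestamp.max)

# The exact-match replace only removes the one string it is given,
# so the third value would still reach pd.to_datetime unchanged.
replaced = s.replace("4000-01-01 00:00:00", pd.NaT)
print(replaced.isna().sum())

# errors="coerce" asks to_datetime to return NaT for values it cannot
# convert instead of raising an exception.
coerced = pd.to_datetime(s, errors="coerce")
print(coerced)
```

Whether the full data frame really contains such additional values is only a guess; something like df["date"].unique() on the real column would show which values are involved.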
Any suggestions on how to solve this problem elegantly, or on what the cause might be?
Thank you.