1

I have a DataFrame with a datetime index. I want to compare each index value with a reference date and assign "Before" if it is earlier than the reference date and "After" if it is on or after it.
My code:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': np.arange(1.0, 9.0)}, index=pd.date_range(start='2020-05-04 08:00:00', freq='1D', periods=8))
df=     
                       A
2020-05-04 08:00:00  1.0
2020-05-05 08:00:00  2.0
2020-05-06 08:00:00  3.0
2020-05-07 08:00:00  4.0
2020-05-08 08:00:00  5.0
2020-05-09 08:00:00  6.0
2020-05-10 08:00:00  7.0
2020-05-11 08:00:00  8.0

ref_date = '2020-05-08'

Expected answer

df=     
                       A    Cond.
2020-05-04 08:00:00  1.0    Before
2020-05-05 08:00:00  2.0    Before
2020-05-06 08:00:00  3.0    Before
2020-05-07 08:00:00  4.0    Before
2020-05-08 08:00:00  5.0    After
2020-05-09 08:00:00  6.0    After
2020-05-10 08:00:00  7.0    After
2020-05-11 08:00:00  8.0    After

My solution:

df['Cond.'] = ['After' if df.index>=(ref_date)=='True' else 'Before cleaning']

Present answer

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
1
  • df.index>=(ref_date)=='True' will not work. First, the boolean object is True, not the string 'True', and you never need to compare with == True anyway. More importantly, you are creating a list with a single element, and even that fails because the vectorized comparison on df.index produces a boolean array, which cannot be used as a single truth value; hence the error message. Commented May 11, 2021 at 0:09
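
To illustrate the point in the comment above, here is a minimal sketch that rebuilds the sample frame from the question and shows that the comparison yields a whole boolean array, which Python cannot reduce to a single True/False inside an if expression. The names mask and error_message are introduced here only for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame from the question.
df = pd.DataFrame(
    {"A": np.arange(1.0, 9.0)},
    index=pd.date_range(start="2020-05-04 08:00:00", freq="D", periods=8),
)
ref_date = "2020-05-08"

# The comparison is vectorized: one boolean per index entry.
mask = df.index >= ref_date
print(type(mask), mask.shape)

# Using the whole array as a single condition raises the ValueError
# quoted in the question.
try:
    label = "After" if mask else "Before"
except ValueError as exc:
    error_message = str(exc)
    print("ValueError:", error_message)
```

The fix is either to keep the operation vectorized or to loop over the index values one at a time, so that each comparison produces a single boolean.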

2 Answers

5
ref_date = "2020-05-08"

df["Cond"] = np.where(df.index < ref_date, "Before", "After")
print(df)

Prints:

                       A    Cond
2020-05-04 08:00:00  1.0  Before
2020-05-05 08:00:00  2.0  Before
2020-05-06 08:00:00  3.0  Before
2020-05-07 08:00:00  4.0  Before
2020-05-08 08:00:00  5.0   After
2020-05-09 08:00:00  6.0   After
2020-05-10 08:00:00  7.0   After
2020-05-11 08:00:00  8.0   After
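
One detail worth checking with this approach is where the boundary falls. A date-only string such as "2020-05-08" is parsed as midnight of that day, so the 08:00 row on the reference date lands in "After", matching the expected output in the question. The sketch below makes the conversion explicit with pd.Timestamp; ref_ts and boundary_label are names assumed here for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame from the question.
df = pd.DataFrame(
    {"A": np.arange(1.0, 9.0)},
    index=pd.date_range(start="2020-05-04 08:00:00", freq="D", periods=8),
)

# A date-only string parses as 2020-05-08 00:00:00.
ref_ts = pd.Timestamp("2020-05-08")

# Vectorized labelling: strictly earlier than the reference is "Before".
df["Cond"] = np.where(df.index < ref_ts, "Before", "After")

# Look up the row that falls on the reference date itself.
boundary_label = df.loc[pd.Timestamp("2020-05-08 08:00:00"), "Cond"]
print(boundary_label)
```

If rows on the reference date should count as "Before" instead, the reference timestamp or the comparison operator would need to change accordingly.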

5 Comments

Mine is a big df. For my original df, I got the following error: file = filename.replace("/", "\\").rsplit("\\", 1)[1] # find the file name IndexError: list index out of range. I ran just df.index < ref_date and got array([ True, True, ... False, False]). So, what could be wrong?
@Mainland With df.index < ref_date you get boolean array. Use this boolean array in np.where() to set "Before"/"After" as I have in my example.
@Mainland that doesn't make any sense. If you are getting an error, edit your question and post the full error message
I saved blary = df.index <= ref_date and then ran np.where(blary,'Before','After'). This gave the same error again. I am still surprised and have no idea why.
Issue sorted. No more issues. Thanks
1

To fix the list comprehension approach you took, you can use:

df['Cond'] = ['After' if x >= pd.to_datetime(ref_date) else 'Before' for x in df.index]
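
As a quick sanity check, the sketch below runs the list comprehension next to a vectorized np.where call on the sample frame from the question and compares the results. The names looped, vectorized and same are assumptions made for this example.

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame from the question.
df = pd.DataFrame(
    {"A": np.arange(1.0, 9.0)},
    index=pd.date_range(start="2020-05-04 08:00:00", freq="D", periods=8),
)
ref_ts = pd.to_datetime("2020-05-08")

# Element-wise: each x is a single Timestamp, so the condition is one boolean.
looped = ["After" if x >= ref_ts else "Before" for x in df.index]

# Vectorized: the whole index is compared at once.
vectorized = np.where(df.index >= ref_ts, "After", "Before").tolist()

same = looped == vectorized
print(same)
```

The comprehension works because the comparison happens per element; on a large frame the vectorized form is likely to be faster, though that is not measured here.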

2 Comments

I appreciate your answer. I do not know why, but I am running into the error file = filename.replace("/", "\\").rsplit("\\", 1)[1] # find the file name IndexError: list index out of range.
Issue sorted. No more issues. Thanks
