I have a dataframe df that contains a column of dates in a string format like '2011-12-13' and a column of time, again in a string format, like '15:40:00'.

df

index                 date        time
2011-01-03 09:40:00   2011-01-03  09:40:00 
2011-01-03 09:45:00   2011-01-03  09:45:00 
2011-01-03 09:50:00   2011-01-03  09:50:00  
2011-01-03 09:55:00   2011-01-03  09:55:00 
2011-01-03 10:00:00   2011-01-03  10:00:00  
2011-01-03 10:05:00   2011-01-03  10:05:00  

My objective is to create a column F0 in my dataframe where F0=1 if the date is one of these dates ('2011-01-26','2011-03-15', '2011-08-09', '2011-09-21', '2011-12-13') and the time is '9:40:00'.

I am trying to use the numpy function where as follows:

dates = ['2011-01-26','2011-03-15', '2011-08-09', '2011-09-21', '2011-12-13']

df['F1'] = np.where((df.date == any(dates) & (df.time== '9:40:00'), 1, 0))

I get this error: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). Why? I don't know how to use the any function correctly.

I also want to create further columns F2, F3, and so on for other times of day, like:

df['F77'] = np.where((df.date == any(dates) & (df.time== '16:00:00'), 1, 0))

1 Answer

You don't need to use where. Just use isin and apply your condition directly to the columns:

df['F1'] = df.date.isin(dates) & (df.time=='09:40:00')
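
As for why any did not work: the built-in any(dates) only reports whether the list contains a truthy element, so it cannot test membership, and the ValueError appears whenever a Series is forced into a single True/False value. The line above also yields True/False rather than 1/0. Below is a minimal sketch of one way to get 1/0 columns for several times, assuming pandas is installed; the sample rows and the time_by_column mapping are made up for illustration and are not from the question.

```python
import pandas as pd

# Small stand-in for the question's frame; these rows are invented for illustration.
df = pd.DataFrame({
    "date": ["2011-01-03", "2011-01-26", "2011-01-26", "2011-03-15"],
    "time": ["09:40:00", "09:40:00", "16:00:00", "09:40:00"],
})

dates = ['2011-01-26', '2011-03-15', '2011-08-09', '2011-09-21', '2011-12-13']

# The built-in any() collapses the list to one bool; it says nothing about df.date.
list_has_truthy_item = any(dates)

# isin() checks every row against the list and returns a boolean Series.
in_dates = df["date"].isin(dates)

# Hypothetical mapping from new column name to the time it should flag.
time_by_column = {"F1": "09:40:00", "F77": "16:00:00"}

for column, time_value in time_by_column.items():
    # Combine the two boolean Series element-wise, then convert True/False to 1/0.
    df[column] = (in_dates & (df["time"] == time_value)).astype(int)

print(df)
```

Note that the comparison is against the exact string, so '9:40:00' would not match a stored value of '09:40:00'.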

1 Comment

fantastic @BrenBarn! I didn't know about isin
