I have a datetime issue: I am trying to match dates against a dataframe that has dates as its index values.

For example, I have dr, which is a list of numpy.datetime64 objects.

dr = [numpy.datetime64('2014-10-31T00:00:00.000000000'),
      numpy.datetime64('2014-11-30T00:00:00.000000000'),
      numpy.datetime64('2014-12-31T00:00:00.000000000'),
      numpy.datetime64('2015-01-31T00:00:00.000000000'),
      numpy.datetime64('2015-02-28T00:00:00.000000000'),
      numpy.datetime64('2015-03-31T00:00:00.000000000')]

Then I have a dataframe, returndf, with dates as its index values:

print(returndf) 
             1    2    3    4    5    6    7    8    9    10
10/31/2014  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
11/30/2014  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN

Please ignore the missing values.

Whenever I try to match a date in dr against the dataframe returndf, for example with returndf.loc[str(dr[1])] for a single month, I get an error:

KeyError: 'the label [2014-11-30T00:00:00.000000000] is not in the [index]'

I would appreciate it if someone could help me convert numpy.datetime64('2014-10-31T00:00:00.000000000') into 10/31/2014 so that I can match it to the dataframe index value.

Thank you,

1 Answer

  1. Your index for returndf is not a DatetimeIndex. Make it so:

    returndf = returndf.set_index(pd.to_datetime(returndf.index))
    
  2. Your dr is a list of NumPy datetime64 objects. Convert it to a DatetimeIndex as well:

    dr = pd.to_datetime(dr)
    
  3. Your sample data clearly shows that the index of returndf does not include all the items in dr. In that case, use reindex, which adds a NaN row for every date in dr that is missing from the index:

    returndf.reindex(dr)
    
                 1   2   3   4   5   6   7   8   9  10
    2014-10-31 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
    2014-11-30 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
    2014-12-31 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
    2015-01-31 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
    2015-02-28 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
    2015-03-31 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
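
The three steps above can be put together as a minimal, self-contained sketch. The sample frame below is an assumption reconstructed from the printed output in the question (string labels, all-NaN values); the names label, row and aligned are only illustrative. It also shows the literal conversion asked for in the question, formatting a datetime64 as 10/31/2014 with strftime, which is an alternative to converting the index and not part of the steps above.

```python
import numpy as np
import pandas as pd

# Assumed sample data modelled on the question: the index holds plain strings.
returndf = pd.DataFrame(
    np.nan,
    index=["10/31/2014", "11/30/2014"],
    columns=range(1, 11),
)

dr = [
    np.datetime64("2014-10-31T00:00:00.000000000"),
    np.datetime64("2014-11-30T00:00:00.000000000"),
    np.datetime64("2014-12-31T00:00:00.000000000"),
    np.datetime64("2015-01-31T00:00:00.000000000"),
    np.datetime64("2015-02-28T00:00:00.000000000"),
    np.datetime64("2015-03-31T00:00:00.000000000"),
]

# Alternative: keep the string index and format the date to match it.
# %m and %d are zero-padded, so this only matches labels written that way.
label = pd.Timestamp(dr[1]).strftime("%m/%d/%Y")  # '11/30/2014'
row_by_string = returndf.loc[label]

# 1. Parse the string labels into a DatetimeIndex.
returndf = returndf.set_index(pd.to_datetime(returndf.index, format="%m/%d/%Y"))

# 2. Turn the list of datetime64 objects into a DatetimeIndex.
dr = pd.to_datetime(dr)

# 3a. A single date that exists in the index can now be looked up directly.
row = returndf.loc[dr[1]]

# 3b. reindex aligns to every date in dr; missing dates become NaN rows.
aligned = returndf.reindex(dr)
print(aligned.shape)  # (6, 10)
```

Passing format to pd.to_datetime is optional here, but it avoids any ambiguity between month-first and day-first parsing.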
    
2 Comments

That was very helpful. But I am a bit confused about datetime. Here are the steps before the variable dr: data_date = pd.to_datetime(testdf["Date"], format="%m/%d/%Y"), then df = sorted(data_date.unique()). data_date seems to be correct, but why does using sorted and unique convert df into datetime64?
You've got several things going on in there. sorted is a Python function and returns a list. Also, you have df = sorted(data_date.unique()). The unique() method will return a numpy array. Not to mention, I don't even know what data_date is. That wasn't in your original question. Next: you don't call pd.to_datetime on a dataframe unless that dataframe has specifically named columns, and if it does have those columns, you don't need to use the format argument. If you want more clarity, post a new question. There is not enough room in comments.
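
To make the types in that exchange concrete, here is a small sketch. The testdf frame is a hypothetical stand-in, since the real one was never posted, and the names unique_values, as_list and dates are illustrative. The exact element type depends on the pandas version: older releases return a NumPy datetime64 array from unique(), while pandas 2.x is understood to return a pandas datetime array, so the sketch prints the types instead of asserting them.

```python
import pandas as pd

# Hypothetical stand-in for the commenter's testdf.
testdf = pd.DataFrame({"Date": ["11/30/2014", "10/31/2014", "11/30/2014"]})
data_date = pd.to_datetime(testdf["Date"], format="%m/%d/%Y")  # a Series

# unique() leaves the Series world; what comes back depends on the pandas version.
unique_values = data_date.unique()
print(type(unique_values))

# sorted() is a Python builtin, so the result is always a plain list.
as_list = sorted(unique_values)
print(type(as_list), type(as_list[0]))

# One way to stay in pandas: build a DatetimeIndex and sort that instead.
dates = pd.DatetimeIndex(data_date.unique()).sort_values()
print(dates)
```

Keeping the dates in a DatetimeIndex means they can be passed straight to reindex or .loc on a frame whose index is also a DatetimeIndex.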
