I'm new to Python, so I hope my question isn't too silly. I want to join two pandas DataFrames (f1 and f3), and it seems that their indices are different.

f1:

DatetimeIndex(['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-04',
           '2018-01-05', '2018-01-06', '2018-01-07', '2018-01-08',
           '2018-01-09', '2018-01-10',
           ...
           '2018-12-22', '2018-12-23', '2018-12-24', '2018-12-25',
           '2018-12-26', '2018-12-27', '2018-12-28', '2018-12-29',
           '2018-12-30', '2018-12-31'],
          dtype='datetime64[ns]', name='date', length=365, freq=None)

f3:

Index([2018-01-01, 2018-01-02, 2018-01-07, 2018-03-30, 2018-04-01, 2018-04-02,
   2018-05-01, 2018-05-10, 2018-05-20, 2018-05-21, 2018-06-04, 2018-08-01,
   2018-12-25, 2018-12-26],
  dtype='object')

Now if I join them in the order cat = [f1, f3] with
cat_total = pd.concat(cat, axis=1, sort=False), it seems to work and the result looks correct:

    print(cat_total.head())
            weekday       holidays
2018-01-01        0   Neujahrestag
2018-01-02        1  Berchtoldstag
2018-01-03        2            NaN
2018-01-04        3            NaN
2018-01-05        4            NaN

If I change the order to cat = [f3, f1], it doesn't work properly:

print(cat_total)
                             holidays  weekday
2018-01-01               Neujahrestag        0
2018-01-02              Berchtoldstag        1
2018-01-07                  Test ZH 1        6
2018-03-30                 Karfreitag        4
2018-04-01                     Ostern        6
2018-04-02                Ostermontag        0
2018-05-01             Tag der Arbeit        1
2018-05-10                   Auffahrt        3
2018-05-20                  Pfingsten        6
2018-05-21              Pfingstmontag        0
2018-06-04                  Test ZH 2        0
2018-08-01           Nationalfeiertag        2
2018-12-25                Weihnachten        1
2018-12-26                Stephanstag        2
2018-01-01 00:00:00               NaN        0
2018-01-02 00:00:00               NaN        1
2018-01-03 00:00:00               NaN        2
2018-01-04 00:00:00               NaN        3
2018-01-05 00:00:00               NaN        4
2018-01-06 00:00:00               NaN        5
2018-01-07 00:00:00               NaN        6

Why does this happen? How can I change the index of one of the DataFrames so that both have the same format?

The f1 index comes from dates = pd.date_range(start = startdate, end = enddate, freq = 'D'), and the f3 index is produced by the external package 'holidays'.

I hope that is all the information needed. Thanks a lot in advance.

Marco

1 Answer

You can use to_datetime to convert the column to datetimes like so (I assume the column is named DATE):

cat_total['DATE'] = pd.to_datetime(cat_total['DATE'],format='%Y-%m-%d', errors='ignore')

to_datetime
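Since the mismatch in the question is in the index rather than in a column, the same conversion can be applied to f3.index. Below is a minimal sketch with shortened sample data. It assumes the f3 index holds datetime.date objects (which would explain dtype='object'), and the frames are rebuilt here only for illustration. It also leaves out errors='ignore', so a failed conversion raises instead of silently leaving the index unchanged.

```python
import datetime as dt

import pandas as pd

# f1: daily DatetimeIndex, built with pd.date_range as in the question
# (shortened to one week for the example).
dates = pd.date_range(start="2018-01-01", end="2018-01-07", freq="D")
f1 = pd.DataFrame({"weekday": dates.weekday.to_numpy()}, index=dates)

# f3: object-dtype index of datetime.date values.
# Assumption: this mirrors what the question's f3 index contains.
f3 = pd.DataFrame(
    {"holidays": ["Neujahrestag", "Berchtoldstag"]},
    index=[dt.date(2018, 1, 1), dt.date(2018, 1, 2)],
)

# Convert the object index to a DatetimeIndex so both frames
# use the same index type before they are aligned.
f3.index = pd.to_datetime(f3.index)

# With matching index types, the order of the frames only affects
# the column order, not the row alignment.
cat_total = pd.concat([f3, f1], axis=1, sort=False)
print(cat_total)
```

After the conversion, each date should appear once in cat_total, with NaN in holidays on days that are not holidays.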

2 Comments

Thanks for your answer, that helped. I had to change the index of f3 as you suggested: data.index = pd.to_datetime(data.index, format='%Y-%m-%d', errors='ignore')
No problem. Please mark it as correct using the green tick.
