Convert pandas multi-index to pandas timestamp

Question

I'm trying to convert an unstacked, multi-indexed data-frame back to a single pandas datetime index.

The index of my original data-frame, i.e. before multi-indexing and unstacking, looks like this:

In [1]: df1_season.index
Out [1]: 

<class 'pandas.tseries.index.DatetimeIndex'>
[2013-05-01 02:00:00, ..., 2014-07-31 23:00:00]
Length: 1472, Freq: None, Timezone: None

Then I apply the multi-indexing and unstacking so that I can plot the yearly data on top of each other, like this:

df_sort = df1_season.groupby(lambda x: (x.year, x.month, x.day, x.hour)).agg(lambda s: s.iloc[-1])
df_sort.index = pd.MultiIndex.from_tuples(df_sort.index, names=['Y','M','D','H'])
unstacked = df_sort.unstack('Y')

My new data-frame for the first two days of May looks like this:

In [2]: unstacked
Out [2]:

          temp        season        
Y        2013  2014    2013    2014
M D  H                             
5 1  2   24.2  22.3  Summer  Summer
     8   24.1  22.3  Summer  Summer
     14  24.3  23.2  Summer  Summer
     20  24.6  23.2  Summer  Summer
  2  2   24.2  22.5  Summer  Summer
     8   24.8  22.2  Summer  Summer
     14  24.9  22.4  Summer  Summer
     20  24.9  22.8  Summer  Summer

736 rows × 4 columns

The index for the new data frame shown above now looks like this:

In [3]: unstacked.index.values[0:8]
Out [3]:

array([(5, 1, 2), (5, 1, 8), (5, 1, 14), (5, 1, 20), (5, 2, 2), (5, 2, 8), (5, 2, 14),
       (5, 2, 20)], dtype=object)

which doesn't produce a very nice plot with respect to the xticks (major and minor). If I can convert this multi-index back to a single pandas datetime index, using only the month, day and hour data, then the major/minor ticks will be plotted automagically the way I would like (I think). For example:

current solution:

xticks = (5, 1, 2), (5, 1, 8) … (5, 2, 20)

required solution:

xticks(major) = Day, Month (displayed as MAY 01, MAY 02 etc etc)
xticks(minor) = Hour (displayed as 02h 08h … 20h)

How do I go about bumping this up for some support? There are some questions on here over a year old without any answers.
– roi3i3ie, Commented Oct 31, 2014 at 18:11
Is there a reason you want to do this "auto-magically"? I would probably just write a function to custom generate the x-labels. That sounds faster than what you want.
– user845888, Commented Mar 15, 2015 at 20:29
Thanks for the reply. Maybe you're right, it's just that I need to maintain a sensible scale when zooming in. I know this would be taken care of using this method.
– roi3i3ie, Commented Mar 24, 2015 at 5:34

firelynx · Accepted Answer · 2015-04-13 15:41:18Z

1

Converting data back and forth in pandas gets messy very fast, as you seem to have experienced. My general recommendation for pandas indexing is to never just set the index, but to copy it first. Make sure you have a column that contains the index values, since pandas does not allow all operations on the index, and repeatedly setting and resetting the index can cause columns to disappear.

TLDR; Don't convert the index back. Keep a copy.
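As a minimal sketch of the "keep a copy" advice, the original timestamps can be stored in an ordinary column before the frame is re-indexed. The frame `df`, the column name `timestamp`, and the sample values are assumptions made up for illustration; they are not the asker's data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df1_season: 6-hourly readings over two days
idx = pd.date_range("2013-05-01 02:00", periods=8, freq=pd.Timedelta(hours=6))
df = pd.DataFrame({"temp": np.linspace(24.0, 25.0, len(idx))}, index=idx)

# Keep a copy of the original timestamps as a regular column
df["timestamp"] = df.index

# Re-index by (month, day, hour); the timestamp column is carried along
df_md = df.set_index([df.index.month, df.index.day, df.index.hour])
df_md.index.names = ["M", "D", "H"]

print(df_md.head())
```

With the copy in place, the datetime values remain available as `df_md["timestamp"]` even though the index is now a MultiIndex.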

answered Apr 13, 2015 at 15:41

firelynx


1 Comment

firelynx Over a year ago

This also adheres to the open/closed principle: en.wikipedia.org/wiki/Open/closed_principle

EngineeredE · Accepted Answer · 2015-03-16 16:45:51Z

0

import matplotlib.pyplot as plt
import pandas as pd
from numpy.random import randn

# Daily random-walk series covering several years
ts = pd.Series(randn(1000), index=pd.date_range('1/1/2000', periods=1000))
ts = ts.cumsum()

plt.figure()
# Plot each year's values against their position within the year
for year in sorted(set(ts.index.year)):
    tmp = ts.loc[str(year)].values
    plt.plot(tmp, label=year)
plt.legend()
plt.show()

I think this is a better way to accomplish your goal than re-indexing. What do you think?
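One possible variation, sketched here under assumptions: instead of plotting against positions 0 to n-1, each year's timestamps can be shifted into a single placeholder year so that the x-axis stays a datetime axis. Matplotlib's `DayLocator`, `HourLocator` and `DateFormatter` can then supply day/month major ticks and hour minor ticks. The sample data, the placeholder year 2000, and the hours 2, 8, 14, 20 (taken from the question's sample rows) are illustrative choices, not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data: 6-hourly readings for May 1-2 in two different years
step = pd.Timedelta(hours=6)
idx = pd.DatetimeIndex(
    list(pd.date_range("2013-05-01 02:00", periods=8, freq=step))
    + list(pd.date_range("2014-05-01 02:00", periods=8, freq=step))
)
ts = pd.Series(np.random.randn(len(idx)).cumsum(), index=idx)

PLACEHOLDER_YEAR = 2000  # assumed dummy year; a leap year keeps Feb 29 valid

fig, ax = plt.subplots()
for year, grp in ts.groupby(ts.index.year):
    # Rebuild the timestamps in the placeholder year so the curves overlap
    parts = pd.DataFrame({
        "year": PLACEHOLDER_YEAR,
        "month": grp.index.month,
        "day": grp.index.day,
        "hour": grp.index.hour,
    })
    x = pd.to_datetime(parts)
    ax.plot(x.values, grp.values, label=str(year))

# Major ticks: one per day, labelled like "May 01"
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b %d"))
# Minor ticks: the sampled hours, labelled like "02h"
ax.xaxis.set_minor_locator(mdates.HourLocator(byhour=[2, 8, 14, 20]))
ax.xaxis.set_minor_formatter(mdates.DateFormatter("%Hh"))
ax.tick_params(axis="x", which="major", pad=15)  # keep the two label rows apart
ax.legend()
fig.savefig("seasonal_overlay.png")
```

Because the axis holds real dates, zooming in an interactive backend should keep a sensible time scale, although fixed locators like these may need adjusting for much longer date ranges.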

answered Mar 16, 2015 at 16:45

EngineeredE


4 Comments

roi3i3ie Over a year ago

Hey, thanks very much for the reply. OK, I've just given this a go. Yes, this appears to be a much easier way of stacking/sorting yearly data on top of each other in one plot, so thanks for that. However, it isn't a solution to the question. Instead of my major/minor xticks being date coded (e.g. day, month, hour), they are now just broken down into arbitrary chunks of single data points, scaling from 0 to n-1, where n is the number of data points in my measurement sample set.

EngineeredE Over a year ago

Right, I would imagine at that point it's an x-axis tick manipulation... but I am unable to figure out how exactly to do that. Could you perhaps upload the data as a csv somewhere so that I could play with it, and maybe create another post on this? Would the best term for this be a 'Seasonality Plot', i.e. take information from multiple years and plot them on one Jan-Dec axis? I can't find any documentation on how to do this, which is surprising to me.

roi3i3ie Over a year ago

Hey! I've been away, sorry for the delay. Let me get back to you on this. I'll get a csv to you also. Yes to your question also. That's exactly what the plot is all about.

roi3i3ie Over a year ago

Ok, here's a link to the source files. I've included a small program too, so that you can have a real example to play with. If you launch the script, you will see two figures. Figure.1 is the requested stacked yearly data, but the horrible xticks. Figure.2 is the requested xticks but without the yearly stacking.

Solomon Vimal · Accepted Answer · 2019-01-29 18:19:36Z

0

Answered here: Pandas multi index to datetime.

df1_season.index = df1_season.index.to_frame()
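Note that `to_frame()` returns a DataFrame of the index levels rather than a DatetimeIndex, so a further conversion step is presumably intended. A hedged sketch of one way to finish it for the question's (M, D, H) index: rename the level columns to the names `pd.to_datetime` expects, add a placeholder year (the index itself carries none), and assemble the timestamps. The year 2000 and the small sample frame are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for the (M, D, H) index of `unstacked`
mi = pd.MultiIndex.from_tuples(
    [(5, 1, 2), (5, 1, 8), (5, 1, 14), (5, 1, 20),
     (5, 2, 2), (5, 2, 8), (5, 2, 14), (5, 2, 20)],
    names=["M", "D", "H"],
)
unstacked = pd.DataFrame(
    {"temp": [24.2, 24.1, 24.3, 24.6, 24.2, 24.8, 24.9, 24.9]}, index=mi
)

# Turn the index levels into columns named the way pd.to_datetime expects
parts = unstacked.index.to_frame(index=False).rename(
    columns={"M": "month", "D": "day", "H": "hour"}
)
parts["year"] = 2000  # assumed placeholder year

# Assemble timestamps from the year/month/day/hour columns
unstacked.index = pd.DatetimeIndex(pd.to_datetime(parts))

print(unstacked.index)
```

The resulting frame has a single DatetimeIndex, so pandas and matplotlib can apply their usual date tick handling when it is plotted.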

answered Jan 29, 2019 at 18:19

Solomon Vimal
