
I'm trying to convert an unstacked, multi-indexed data-frame back to a single pandas datetime index.

The index of my original data-frame, i.e. before multi-indexing and unstacking, looks like this:

In [1]: df1_season.index
Out [1]: 

<class 'pandas.tseries.index.DatetimeIndex'>
[2013-05-01 02:00:00, ..., 2014-07-31 23:00:00]
Length: 1472, Freq: None, Timezone: None

Then I apply the multi-indexing and unstacking so that I can plot the yearly data on top of each other:

df_sort = df1_season.groupby(lambda x: (x.year, x.month, x.day, x.hour)).agg(lambda s: s[-1])
df_sort.index = pd.MultiIndex.from_tuples(df_sort.index, names=['Y','M','D','H'])
unstacked = df_sort.unstack('Y')

My new data-frame for the first two days of May looks like this:

In [2]: unstacked
Out [2]:

          temp        season        
Y        2013  2014    2013    2014
M D  H                             
5 1  2   24.2  22.3  Summer  Summer
     8   24.1  22.3  Summer  Summer
     14  24.3  23.2  Summer  Summer
     20  24.6  23.2  Summer  Summer
  2  2   24.2  22.5  Summer  Summer
     8   24.8  22.2  Summer  Summer
     14  24.9  22.4  Summer  Summer
     20  24.9  22.8  Summer  Summer

736 rows × 4 columns 

The index for the new data frame shown above now looks like this:

In [2]: unstacked.index.values[0:8]
Out [2]:

array([(5, 1, 2), (5, 1, 8), (5, 1, 14), (5, 1, 20), (5, 2, 2), (5, 2, 8), (5, 2, 14),
       (5, 2, 20)], dtype=object)

which doesn't produce a very nice plot with respect to the xticks (major and minor). If I can convert this multi-index back to a single pandas datetime index, using only the month, day and hour data, then the major/minor ticks will be plotted automagically the way I would like (I think). For example:

current solution:

xticks = (5, 1, 2), (5, 1, 8) … (5, 2, 20)

required solution:

xticks(major) = Day, Month (displayed as MAY 01, MAY 02 etc etc)
xticks(minor) = Hour (displayed as 02h 08h … 20h)
5
  • Even a little hint would be greatly appreciated. Commented Oct 25, 2014 at 18:28
  • How do I go about bumping this up for some support? There are some questions on here over a year old without any answers. Commented Oct 31, 2014 at 18:11
  • Another month? Anything at all will help... Commented Nov 29, 2014 at 19:05
  • Is there a reason you want to do this "auto-magically"? I would probably just write a function to custom generate the x-labels. That sounds faster than what you want. Commented Mar 15, 2015 at 20:29
  • Thanks for the reply. Maybe you're right, it's just that I need to maintain a sensible scale when zooming in. I know this would be taken care of using this method. Commented Mar 24, 2015 at 5:34

3 Answers

1

Converting data back and forth in pandas gets messy very fast, as you seem to have experienced. My general recommendation for pandas indexing is to never just set the index, but to copy it first. Make sure you have a column that contains the index values, since pandas does not allow all operations on the index, and repeatedly setting and resetting the index can cause columns to disappear.

TL;DR: Don't convert the index back. Keep a copy.
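To illustrate the "keep a copy" idea, here is a minimal sketch. The data, the column name `timestamp` and the variable names are made up for illustration; they are not from the question. It stores the datetime index in an ordinary column before the index is replaced, so the original index can be restored later.

```python
import pandas as pd

# Hypothetical stand-in for the asker's data: readings every six hours.
idx = pd.date_range("2013-05-01 02:00", periods=8, freq=pd.Timedelta(hours=6))
df = pd.DataFrame(
    {"temp": [24.2, 24.1, 24.3, 24.6, 24.2, 24.8, 24.9, 24.9]}, index=idx
)

# Keep the original timestamps in an ordinary column before touching the index.
# The column name "timestamp" is an assumption made for this sketch.
df["timestamp"] = df.index

# Replace the index with something else, here a (month, day, hour) MultiIndex.
df.index = pd.MultiIndex.from_arrays(
    [idx.month, idx.day, idx.hour], names=["M", "D", "H"]
)

# The datetime index can be restored from the saved column at any time.
restored = df.set_index("timestamp")
print(restored.index)
```

This only helps while the rows still correspond one-to-one to the original timestamps; after an unstack that merges several years into one row, there is no single timestamp per row left to restore.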


1 Comment

This also adheres to the open/closed principle: en.wikipedia.org/wiki/Open/closed_principle
0
import pandas as pd
import matplotlib.pyplot as plt
from numpy.random import randn

# Synthetic daily series covering several years
ts = pd.Series(randn(1000), index=pd.date_range('1/1/2000', periods=1000))
ts = ts.cumsum()

plt.figure()
# Plot each year's values against their position within that year
for year in sorted(set(ts.index.year)):
    tmp = ts.loc[str(year)].values
    plt.plot(tmp, label=year)
plt.legend()
plt.show()

I think this is a better way to accomplish your goal than re-indexing. What do you think?

4 Comments

Hey! Thanks very much for the reply. OK, I've just given this a go. Yes, this appears to be a much easier way of stacking/sorting yearly data on top of each other in one plot, so thanks for that. However, it isn't a solution to the question. Instead of my major/minor xticks being calendar coded (e.g. day, month, hour), they are now just broken down into arbitrary chunks of single data points, scaling from 0 to n-1, where n is the number of data points in my measurement sample set.
Right, I would imagine at that point it's an x-axis tick manipulation... but I am unable to figure out how exactly to do that. Could you perhaps upload the data as a csv somewhere so that I could play with it and maybe create another post on this? Would the best term for this be a 'Seasonality Plot': take information from multiple years and plot them on one Jan-Dec axis? I can't find any documentation on how to do this, which is surprising to me.
Hey! I've been away, sorry for the delay. Let me get back to you on this. I'll get a csv to you also. Yes to your question also. That's exactly what the plot is all about.
Ok, here's a link to the source files. I've included a small program too, so that you can have a real example to play with. If you launch the script, you will see two figures. Figure 1 is the requested stacked yearly data, but with the horrible xticks. Figure 2 has the requested xticks, but without the yearly stacking.
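For what it's worth, the x-axis tick manipulation discussed in these comments could be sketched roughly as follows. This is only an illustration on made-up data, not the asker's files: every year is shifted onto one dummy year (2000 here, an arbitrary choice) so the curves overlap on a real datetime axis, and matplotlib's date locators and formatters then handle the major (day) and minor (hour) ticks.

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data: two years of six-hourly readings for the first days of May.
step = pd.Timedelta(hours=6)
idx = pd.date_range("2013-05-01 02:00", periods=8, freq=step).append(
    pd.date_range("2014-05-01 02:00", periods=8, freq=step)
)
rng = np.random.default_rng(0)
ts = pd.Series(24 + rng.standard_normal(len(idx)), index=idx)

DUMMY_YEAR = 2000  # assumption: any leap year works as the shared year

fig, ax = plt.subplots()
for year, chunk in ts.groupby(ts.index.year):
    # Move every timestamp into the dummy year so the years share one axis.
    overlay_x = chunk.index.map(lambda t: t.replace(year=DUMMY_YEAR))
    ax.plot(overlay_x, chunk.to_numpy(), label=str(year))

# Major ticks: one per day, labelled like "May 01".
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b %d"))
# Minor ticks: the sampled hours, labelled like "08h".
ax.xaxis.set_minor_locator(mdates.HourLocator(byhour=[2, 8, 14, 20]))
ax.xaxis.set_minor_formatter(mdates.DateFormatter("%Hh"))
# Push the major labels down so they do not collide with the minor ones.
ax.tick_params(axis="x", which="major", pad=15)
ax.legend()
# plt.show()  # uncomment in an interactive session
```

Because the axis holds real dates, the locators can be swapped for `mdates.AutoDateLocator` if the scale should adapt while zooming.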
0

Answered here: Pandas multi index to datetime.

df1_season.index = df1_season.index.to_frame()
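The one-liner above is terse, so here is a hedged sketch of one way the conversion might look for an index shaped like the one in the question (levels M, D, H). It is not taken from the linked answer. It assumes that a dummy year (2000, chosen arbitrarily) is acceptable as a stand-in for the missing year, and it rebuilds a small frame from the values shown in the question rather than using the real data.

```python
import pandas as pd

# Stand-in for `unstacked`: (month, day, hour) rows, one temp column per year.
index = pd.MultiIndex.from_tuples(
    [(5, 1, 2), (5, 1, 8), (5, 1, 14), (5, 1, 20),
     (5, 2, 2), (5, 2, 8), (5, 2, 14), (5, 2, 20)],
    names=["M", "D", "H"],
)
unstacked = pd.DataFrame(
    {2013: [24.2, 24.1, 24.3, 24.6, 24.2, 24.8, 24.9, 24.9],
     2014: [22.3, 22.3, 23.2, 23.2, 22.5, 22.2, 22.4, 22.8]},
    index=index,
)

# Turn the index levels into columns named the way pd.to_datetime expects.
parts = unstacked.index.to_frame(index=False).rename(
    columns={"M": "month", "D": "day", "H": "hour"}
)
# Assumption: a dummy leap year stands in for the year that was unstacked away.
parts["year"] = 2000

# Assemble timestamps from the year/month/day/hour columns and use them as index.
flat = unstacked.copy()
flat.index = pd.DatetimeIndex(pd.to_datetime(parts))
print(flat.index)
```

With a `DatetimeIndex` in place, `flat.plot()` puts real dates on the x-axis; the dummy year can be hidden by choosing tick formats that omit it.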
