7

I'm trying to plot a time-series of histograms in Python. There has been a similar question about this, but in R. So, basically, I need the same thing, but I'm really bad in R. There are usually 48 values per day in my dataset. Where - 9999 represents missing data. Here's the sample of the data.

I started with reading in the data and constructing a pandas DataFrame.

import pandas as pd
df = pd.read_csv('sample.csv', parse_dates=True, index_col=0, na_values='-9999') 
print df

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 336 entries, 2008-07-25 14:00:00 to 2008-08-01 13:30:00
Data columns (total 1 columns):
159.487691046    330  non-null values
dtypes: float64(1)

Now I can group the data by day:

daily = df.groupby(lambda x: x.date())

But then I'm stuck. I don't know how to use this with matplotlib to get my timeseries of histograms. Any help appreciated, not necessarily using pandas.

1 Answer 1

5

Make a histogram and use matplotlib's pcolor.

We need to bin the groups uniformly, so we make bins manually based on the range of your sample data.

In [26]: bins = np.linspace(0, 360, 10)

Apply histogram to each group.

In [27]: f = lambda x: Series(np.histogram(x, bins=bins)[0], index=bins[:-1])

In [28]: df1 = daily.apply(f)

In [29]: df1
Out[29]: 
            0    40   80   120  160  200  240  280  320
2008-07-25    0    0    0    3   18    0    0    0    0
2008-07-26    2    0    0    0   17    6   13    1    8
2008-07-27    4    3   10    0    0    0    0    0   31
2008-07-28    0    7   15    0    0    0    0    6   20
2008-07-29    0    0    0    0    0    0   20   26    0
2008-07-30   10    1    0    0    0    0    1   25    9
2008-07-31   30    4    1    0    0    0    0    0   12
2008-08-01    0    0    0    0    0    0    0   14   14

Following your linked example in R, the horizontal axis should be dates, and the vertical axis should be the range of bins. The histogram values are a "heat map."

In [30]: pcolor(df1.T)
Out[30]: <matplotlib.collections.PolyCollection at 0xbb60e2c>

enter image description here

It remains to label the axes. This answer should be of some help.

Sign up to request clarification or add additional context in comments.

1 Comment

Thanks! This should do it. I totally forgot to mention that - 9999 is a missing number and should be disarded. Will add it to the question.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.