I have a time-sampled data set that is effectively indexed by two columns (timestamp, ID). However, some timestamps have no sample point for a given ID.

How can I make a stackplot with Matplotlib for this kind of data?

import pandas as pd
import numpy as np
import io
import matplotlib.pyplot as plt

df = pd.read_csv(io.StringIO('''
A,B,C
1,1,0
1,2,0
1,3,0
1,4,0
2,1,.5
2,2,.2

2,4,.15
3,1,.7

3,3,.1
3,4,.2
'''.strip()))

b = np.unique(df.B)
plt.stackplot(np.unique(df.A),
              [df[df.B==_b].C for _b in b],
              labels=['B:{0}'.format(_b) for _b in b],
)
plt.xlabel('A')
plt.ylabel('C')
plt.legend(loc='upper left')
plt.show()

When I run this program, Python raises:

TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

When I manually fill in the missing data points (see blank lines in string literal), the plot works fine.

Is there a straightforward way to "insert" zero records for missing sample data (like this question, but I have two columns functioning as indices, and I don't know how to adapt the solution to my problem) or have Matplotlib plot with holes?

1 Answer

You could use df.pivot to massage the DataFrame into a form amenable to calling DataFrame.plot(kind='area'). For example, if

In [46]: df
Out[46]: 
   A  B     C
0  1  1  0.00
1  1  2  0.00
2  1  3  0.00
3  1  4  0.00
4  2  1  0.50
5  2  2  0.20
6  2  4  0.15
7  3  1  0.70
8  3  3  0.10
9  3  4  0.20

then

In [47]: df.pivot(columns='B', index='A')
Out[47]: 
     C                
B    1    2    3     4
A                     
1  0.0  0.0  0.0  0.00
2  0.5  0.2  NaN  0.15
3  0.7  NaN  0.1  0.20

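Notice that df.pivot inserts NaN for the missing (A, B) combinations, and pandas treats those NaN values as zero when it draws a stacked area plot. With the pivoted DataFrame assigned to result (as in the full code below),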

result.plot(kind='area')

produces the desired plot.
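If you would rather insert explicit zero records into the long-format data, as the question asks, one possible approach is to reindex against every (A, B) combination. This is only a sketch: the names full_index and filled are illustrative, and it assumes each (A, B) pair occurs at most once and that a missing sample should count as 0.

```python
import io

import pandas as pd

# Same data as in the question, with the missing rows simply left out.
csv_text = """A,B,C
1,1,0
1,2,0
1,3,0
1,4,0
2,1,.5
2,2,.2
2,4,.15
3,1,.7
3,3,.1
3,4,.2
"""
df = pd.read_csv(io.StringIO(csv_text))

# Build every (A, B) combination that could appear in the data.
full_index = pd.MultiIndex.from_product(
    [sorted(df["A"].unique()), sorted(df["B"].unique())],
    names=["A", "B"],
)

# Reindex on (A, B); combinations absent from df get C = 0.
filled = (
    df.set_index(["A", "B"])
    .reindex(full_index, fill_value=0)
    .reset_index()
)

print(filled)
```

After this step every value of B has one row per value of A, so the per-B slices all have the same length.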


The complete script:

import pandas as pd
import matplotlib.pyplot as plt

try:
    # for Python2
    from cStringIO import StringIO
except ImportError:
    # for Python3
    from io import StringIO


df = pd.read_csv(StringIO('''
A,B,C
1,1,0
1,2,0
1,3,0
1,4,0
2,1,.5
2,2,.2

2,4,.15
3,1,.7

3,3,.1
3,4,.2
'''.strip()))


result = df.pivot(columns='B', index='A')
result.columns = result.columns.droplevel(0)
# Alternatively, the above two lines are equivalent to
# result = df.set_index(['A','B'])['C'].unstack('B')

ax = result.plot(kind='area')
lines, labels = ax.get_legend_handles_labels()
ax.set_ylabel('C')
ax.legend(lines, ['B:{0}'.format(b) for b in result.columns], loc='best')

plt.show()

yields the desired stacked area plot.
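If you want to stay with plt.stackplot, as in the question, the same pivot can feed Matplotlib directly once the NaN values are replaced. The following is a minimal sketch, assuming missing samples should be drawn as 0; the name wide is illustrative.

```python
import io

import matplotlib.pyplot as plt
import pandas as pd

# Same data as in the question, with the missing rows simply left out.
csv_text = """A,B,C
1,1,0
1,2,0
1,3,0
1,4,0
2,1,.5
2,2,.2
2,4,.15
3,1,.7
3,3,.1
3,4,.2
"""
df = pd.read_csv(io.StringIO(csv_text))

# One row per A, one column per B; missing combinations become NaN, then 0.
wide = df.pivot(index="A", columns="B", values="C").fillna(0)

fig, ax = plt.subplots()
# stackplot expects one row of y-values per layer, hence the transpose.
polys = ax.stackplot(
    wide.index,
    wide.T.to_numpy(),
    labels=["B:{0}".format(b) for b in wide.columns],
)
ax.set_xlabel("A")
ax.set_ylabel("C")
ax.legend(loc="upper left")
```

Call plt.show() (or fig.savefig with a file name of your choice) afterwards to view the result. Because every layer now has the same length as the x-axis, the TypeError from the question should no longer occur.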
