Plot multi-level headers dataframe with Matplotlib

Question

I have read an Excel file with pandas as follows. How can I plot it properly with Matplotlib?

BTW, when I use read_clipboard() on data in this format, it raises ParserError: Expected 4 fields in line 3, saw 5. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.

After manually modifying the Excel file into the following format:

    date  A_ratio  A_price  B_ratio  B_price
0   2007    12.00     8.90     3.04     6.35
1   2008    13.00     8.78     4.04     6.25
2   2009    14.00     9.08     5.04     6.50
3   2010    14.71     9.21     1.38     6.60
4   2011    15.71     9.22     2.38     6.66
5   2012    16.71     9.27     3.38     6.66
6   2013    16.09     9.56     1.38     6.85
7   2014    17.09     9.71     2.38     6.94
8   2015    18.09     9.31     3.38     6.65
9   2016    19.09     9.88     4.38     6.95
10  2017    20.09     9.76     5.38     6.88

I plotted it with the following code. It works, but I don't want to change the file manually since my original data is pretty large:

df = df.set_index('date')
plt.figure(figsize=(10, 10))
cols = ['A_ratio', 'A_price', 'B_ratio', 'B_price']
df[cols].plot(kind='bar')
plt.xticks(rotation=45)
plt.xlabel("")

Output: [bar chart of the four columns by date; image not shown]

Please help me, thanks.

jezrael · Accepted Answer · 2020-01-17 12:33:58Z

I think you can use map with join to flatten the MultiIndex:

df = df.set_index('date')
df.columns = df.columns.map('_'.join)

plt.figure(figsize=(10, 10))
cols = ['A_ratio', 'A_price', 'B_ratio', 'B_price']
df[cols].plot(kind='bar')
plt.xticks(rotation=45)
plt.xlabel("")
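
As a minimal sketch of what the flattening step does, the snippet below builds a small frame by hand instead of reading the Excel file. The two-level header (A/B over ratio/price) is an assumption inferred from the flattened column names in the question, and only the first three rows of data are used.

```python
import pandas as pd

# Hand-built sample; the two-level header (A/B over ratio/price) is
# assumed from the column names shown in the question.
columns = pd.MultiIndex.from_product([["A", "B"], ["ratio", "price"]])
df = pd.DataFrame(
    [[12.00, 8.90, 3.04, 6.35],
     [13.00, 8.78, 4.04, 6.25],
     [14.00, 9.08, 5.04, 6.50]],
    index=pd.Index([2007, 2008, 2009], name="date"),
    columns=columns,
)

# Each column label is a tuple such as ("A", "ratio"); "_".join turns it
# into the single string "A_ratio".
flat = df.copy()
flat.columns = flat.columns.map("_".join)

print(list(flat.columns))
# ['A_ratio', 'A_price', 'B_ratio', 'B_price']
```

After this step the flat names from the question's own plotting code can be used unchanged.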

Or you can select the MultiIndex columns by tuples:

df = df.set_index('date')

plt.figure(figsize=(10, 10))
cols = [('A','ratio'), ('A','price'), ('B','ratio'),('B','price')]
df[cols].plot(kind='bar')
plt.xticks(rotation=45)
plt.xlabel("")
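
Tuple selection can be tried on a hand-built frame as well. This is a sketch under the same assumption about the header layout (A/B over ratio/price), and it picks only the two ratio columns to show that any subset of tuples works. The Agg backend line is there only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; drop for interactive use
import pandas as pd

# Hand-built sample; the header layout is assumed, not read from the real file.
columns = pd.MultiIndex.from_product([["A", "B"], ["ratio", "price"]])
df = pd.DataFrame(
    [[12.00, 8.90, 3.04, 6.35],
     [13.00, 8.78, 4.04, 6.25],
     [14.00, 9.08, 5.04, 6.50]],
    index=pd.Index([2007, 2008, 2009], name="date"),
    columns=columns,
)

# A list of (level 0, level 1) tuples selects exactly those columns and
# keeps the MultiIndex intact.
cols = [("A", "ratio"), ("B", "ratio")]
subset = df[cols]

# figsize and rot are passed to plot() directly, since DataFrame.plot
# creates its own figure when no ax is given.
ax = subset.plot(kind="bar", figsize=(10, 10), rot=45)
ax.set_xlabel("")
```

With this approach the columns are never renamed, so the legend shows the tuple labels rather than flattened strings.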


6 Comments

ah bon Over a year ago

Sorry, df = df.set_index('date') raises ValueError: cannot handle a non-unique multi-index! I think the problem comes from the multiple header rows. The Excel file hasn't been read correctly by pandas yet.

jezrael Over a year ago

@ahbon - Did you try pd.read_excel(file, header=[0, 1], index_col=[0])?

ah bon Over a year ago

This seems to read the file properly, but df.set_index('date') now raises KeyError: "None of ['date'] are in the columns".

jezrael Over a year ago

@ahbon - What if you omit df = df.set_index('date')? The read_excel call should already make date the index.

jezrael Over a year ago

If you check df = pd.read_excel(file, header=[0, 1], index_col=[0]), it converts the date column to the index, so use df = df.sort_index() instead of df.sort_values(by='date').
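
The reading step discussed in these comments can be sketched without an Excel file. Here pd.read_csv on an in-memory string stands in for pd.read_excel, with the same header and index_col arguments. The file layout (two header rows, first column holding the years) is an assumption, and the rows are deliberately out of order to show sort_index.

```python
import io

import pandas as pd

# Assumed layout: two header rows, with the years in the first column.
# read_csv stands in for read_excel, which takes the same header= and
# index_col= arguments.
raw = """,A,A,B,B
,ratio,price,ratio,price
2009,14.00,9.08,5.04,6.50
2007,12.00,8.90,3.04,6.35
2008,13.00,8.78,4.04,6.25
"""

df = pd.read_csv(io.StringIO(raw), header=[0, 1], index_col=0)

# The first column is already the index, so set_index('date') is not
# needed; naming the index and sorting it replaces sort_values(by='date').
df = df.rename_axis("date").sort_index()

print(df)
```

From here either approach in the answer applies: flatten the columns with map and join, or select them by tuples.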
