I have read an Excel file with pandas, as shown below. How can I plot it properly with Matplotlib?

BTW, when I call read_clipboard() on data in this format, it raises ParserError: Expected 4 fields in line 3, saw 5. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.

enter image description here

After manually modifying the Excel file into the following format:

    date  A_ratio  A_price  B_ratio  B_price
0   2007    12.00     8.90     3.04     6.35
1   2008    13.00     8.78     4.04     6.25
2   2009    14.00     9.08     5.04     6.50
3   2010    14.71     9.21     1.38     6.60
4   2011    15.71     9.22     2.38     6.66
5   2012    16.71     9.27     3.38     6.66
6   2013    16.09     9.56     1.38     6.85
7   2014    17.09     9.71     2.38     6.94
8   2015    18.09     9.31     3.38     6.65
9   2016    19.09     9.88     4.38     6.95
10  2017    20.09     9.76     5.38     6.88

I plotted it with the following code. It works, but I don't want to edit the file by hand, since my original data is pretty large:

df = df.set_index('date')
plt.figure(figsize=(10, 10))
cols = ['A_ratio', 'A_price', 'B_ratio', 'B_price']
df[cols].plot(kind='bar')
plt.xticks(rotation=45)
plt.xlabel("")

Output: enter image description here

Please help me, thanks.

1 Answer

I think you can use map with join to flatten the MultiIndex columns:

df = df.set_index('date')
df.columns = df.columns.map('_'.join)

plt.figure(figsize=(10, 10))
cols = ['A_ratio', 'A_price', 'B_ratio', 'B_price']
df[cols].plot(kind='bar')
plt.xticks(rotation=45)
plt.xlabel("")
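
To make the flattening step concrete, here is a minimal sketch on a small hand-built frame. The sample values come from the question's table; the two-level column layout (A/B over ratio/price) and the index name date are assumptions about how the original file looks once it is read with two header rows.

```python
import pandas as pd

# Hypothetical stand-in for the two-row header: (group, measure) pairs.
columns = pd.MultiIndex.from_product([["A", "B"], ["ratio", "price"]])
df = pd.DataFrame(
    [[12.00, 8.90, 3.04, 6.35],
     [13.00, 8.78, 4.04, 6.25]],
    index=pd.Index([2007, 2008], name="date"),
    columns=columns,
)

# Each column label is a tuple such as ('A', 'ratio');
# '_'.join turns it into the single string 'A_ratio'.
flat = df.copy()
flat.columns = flat.columns.map("_".join)

print(flat.columns.tolist())
```

Note that '_'.join only works when every level value is a string, so this assumes the header rows contain text labels only.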

Or you can select the MultiIndex columns by tuples:

df = df.set_index('date')

plt.figure(figsize=(10, 10))
cols = [('A','ratio'), ('A','price'), ('B','ratio'),('B','price')]
df[cols].plot(kind='bar')
plt.xticks(rotation=45)
plt.xlabel("")
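
A self-contained sketch of the tuple selection, again on a hand-built frame whose column layout and index name are assumptions. As far as I know, DataFrame.plot opens its own figure unless an ax is passed, so the preceding plt.figure(figsize=...) call may not size the bar chart; passing figsize to plot directly is one way around that.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the two-row header: (group, measure) pairs.
columns = pd.MultiIndex.from_product([["A", "B"], ["ratio", "price"]])
df = pd.DataFrame(
    [[12.00, 8.90, 3.04, 6.35],
     [13.00, 8.78, 4.04, 6.25],
     [14.00, 9.08, 5.04, 6.50]],
    index=pd.Index([2007, 2008, 2009], name="date"),
    columns=columns,
)

# Select MultiIndex columns with a list of (level 0, level 1) tuples.
cols = [("A", "ratio"), ("A", "price"), ("B", "ratio"), ("B", "price")]

# figsize and rot are passed to plot itself, which returns the Axes.
ax = df[cols].plot(kind="bar", figsize=(10, 10), rot=45)
ax.set_xlabel("")
```

Call plt.show() or ax.figure.savefig(...) afterwards to display or save the chart.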

6 Comments

Sorry, df = df.set_index('date') raises ValueError: cannot handle a non-unique multi-index! I think the problem comes from the multiple header rows. pandas hasn't read the Excel file correctly yet.
@ahbon - Did you try pd.read_excel(file, header=[0, 1], index_col=[0])?
This seems to read the file properly, but df.set_index('date') still raises KeyError: "None of ['date'] are in the columns".
@ahbon - What if you omit df = df.set_index('date')? The read_excel call should already put date in the index.
If you check df = pd.read_excel(file, header=[0, 1], index_col=[0]), it converts the date column to the index, so use df = df.sort_index() instead of df.sort_values(by='date').
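
Putting the comments together, a rough sketch of the suggested flow. The frame is built by hand so it runs without a workbook; its shape (years in an index named date, two-level columns) is an assumption about what the read_excel call returns for the original file.

```python
import pandas as pd

# With a real workbook the frame would come from something like:
#     df = pd.read_excel(file, header=[0, 1], index_col=[0])
# Here it is built by hand, unsorted on purpose (assumed shape).
columns = pd.MultiIndex.from_product([["A", "B"], ["ratio", "price"]])
df = pd.DataFrame(
    [[14.00, 9.08, 5.04, 6.50],
     [12.00, 8.90, 3.04, 6.35],
     [13.00, 8.78, 4.04, 6.25]],
    index=pd.Index([2009, 2007, 2008], name="date"),
    columns=columns,
)

# 'date' lives in the index, not in the columns, which is why
# df.set_index('date') fails with a KeyError on such a frame.
has_date_column = "date" in df.columns.get_level_values(0)

# Sort by the index instead of sort_values(by='date').
df_sorted = df.sort_index()

print(has_date_column)
print(df_sorted.index.tolist())
```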