I'm trying to plot the GDP per capita of various countries in a line chart using pandas and matplotlib. The DataFrame I'm working from has a MultiIndex on country and date.

This is the code I'm trying to run to produce the plot:

countries = df_no_na.index.levels[0].to_list()
dates = df_no_na.index.levels[1].to_list()
# Create Figure (empty canvas)
fig = plt.figure()
# Add set of axes to figure
axes = fig.add_axes([0.1, 0.1, 0.8, 0.8]) # left, bottom, width, height (range 0 to 1)
axes.plot(dates,df_no_na["GDP per capita"]["United States"])

I'm getting an error: ValueError: x and y must have same first dimension, but have shapes (61,) and (5,)

I think this is because df_no_na["GDP per capita"]["United States"] returns:

date
2019-01-01    65297.517508
2018-01-01    62996.471285
2017-01-01    60062.222313
2016-01-01    57951.584082
2015-01-01    56839.381774
Name: GDP per capita, dtype: float64

How can I plot, or alternatively what is the best way to plot, data from a dataframe with a multi index?

1 Answer

  • The primary issue with plotting is getting the DataFrame into the correct shape for the plot API.
    • In this case, it is probably "easiest" to reset the index, and then plot with seaborn.lineplot. However, this is discrete, not continuous data, so it is "better" to display it as a seaborn.barplot.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# test data
data = {('Aruba', '1986-01-01'): {'gdp': 6472.50202920407}, ('Aruba', '1987-01-01'): {'gdp': 7885.79654466735}, ('Aruba', '1988-01-01'): {'gdp': 9764.789978793291}, ('Aruba', '1989-01-01'): {'gdp': 11392.455810576399}, ('Aruba', '1990-01-01'): {'gdp': 12307.311737831398}, ('Aruba', '1991-01-01'): {'gdp': 13496.003142641799}, ('Aruba', '1992-01-01'): {'gdp': 14046.5037643078}, ('Aruba', '1993-01-01'): {'gdp': 14936.8272187795}, ('Aruba', '1994-01-01'): {'gdp': 16241.0465209443}, ('Aruba', '1995-01-01'): {'gdp': 16439.3563609282}, ('Aruba', '1996-01-01'): {'gdp': 16586.068435754198}, ('Aruba', '1997-01-01'): {'gdp': 17927.749635208602}, ('Aruba', '1998-01-01'): {'gdp': 19078.3431907515}, ('Aruba', '1999-01-01'): {'gdp': 19356.2033894901}, ('Aruba', '2000-01-01'): {'gdp': 20620.7006259175}, ('Aruba', '2001-01-01'): {'gdp': 20669.0319688645}, ('Aruba', '2002-01-01'): {'gdp': 20436.8871286309}, ('Aruba', '2003-01-01'): {'gdp': 20833.7616116694}, ('Aruba', '2004-01-01'): {'gdp': 22569.9749851801}, ('Aruba', '2005-01-01'): {'gdp': 23300.0395575696}, ('Aruba', '2006-01-01'): {'gdp': 24045.272483354704}, ('Aruba', '2007-01-01'): {'gdp': 25835.132667628397}, ('Aruba', '2008-01-01'): {'gdp': 27084.7036903653}, ('Aruba', '2009-01-01'): {'gdp': 24630.4537141023}, ('Aruba', '2010-01-01'): {'gdp': 23512.602595639702}, ('Aruba', '2011-01-01'): {'gdp': 24985.9932813737}, ('Aruba', '2012-01-01'): {'gdp': 24713.6980451285}, ('Aruba', '2013-01-01'): {'gdp': 26189.4355088129}, ('Aruba', '2014-01-01'): {'gdp': 26647.938100985}, ('Aruba', '2015-01-01'): {'gdp': 27980.880695275097}, ('Aruba', '2016-01-01'): {'gdp': 28281.35048163}, ('Aruba', '2017-01-01'): {'gdp': 29007.6930034887}, ('Afghanistan', '1960-01-01'): {'gdp': 59.7731938409853}, ('Afghanistan', '1961-01-01'): {'gdp': 59.8608738790779}, ('Afghanistan', '1962-01-01'): {'gdp': 58.458014949543895}, ('Afghanistan', '1963-01-01'): {'gdp': 78.7063875407802}, ('Afghanistan', '1964-01-01'): {'gdp': 82.0952307131832}, ('Afghanistan', 
'1965-01-01'): {'gdp': 101.10830485337699}, ('Afghanistan', '1966-01-01'): {'gdp': 137.594352053111}, ('Afghanistan', '1967-01-01'): {'gdp': 160.89858887243798}, ('Afghanistan', '1968-01-01'): {'gdp': 129.108323102596}, ('Afghanistan', '1969-01-01'): {'gdp': 129.329712876621}, ('Afghanistan', '1970-01-01'): {'gdp': 156.518939442982}, ('Afghanistan', '1971-01-01'): {'gdp': 159.567578521888}, ('Afghanistan', '1972-01-01'): {'gdp': 135.31730831433}, ('Afghanistan', '1973-01-01'): {'gdp': 143.14464950008102}, ('Afghanistan', '1974-01-01'): {'gdp': 173.653764639169}, ('Afghanistan', '1975-01-01'): {'gdp': 186.510897140201}, ('Afghanistan', '1976-01-01'): {'gdp': 197.44550755114497}, ('Afghanistan', '1977-01-01'): {'gdp': 224.224797281134}, ('Afghanistan', '1978-01-01'): {'gdp': 247.354106347038}, ('Afghanistan', '1979-01-01'): {'gdp': 275.738197619262}, ('Afghanistan', '1980-01-01'): {'gdp': 272.65528565023203}, ('Afghanistan', '1981-01-01'): {'gdp': 264.11131745306096}, ('Afghanistan', '2002-01-01'): {'gdp': 179.426610967229}, ('Afghanistan', '2003-01-01'): {'gdp': 190.683814295088}, ('Afghanistan', '2004-01-01'): {'gdp': 211.382116942655}, ('Afghanistan', '2005-01-01'): {'gdp': 242.031284871985}, ('Afghanistan', '2006-01-01'): {'gdp': 263.733691663044}, ('Afghanistan', '2007-01-01'): {'gdp': 359.69323750139506}, ('Afghanistan', '2008-01-01'): {'gdp': 364.6607447985}, ('Afghanistan', '2009-01-01'): {'gdp': 438.076034406941}, ('Afghanistan', '2010-01-01'): {'gdp': 543.303041863931}, ('Afghanistan', '2011-01-01'): {'gdp': 591.162759035926}, ('Afghanistan', '2012-01-01'): {'gdp': 641.8714791575389}, ('Afghanistan', '2013-01-01'): {'gdp': 637.165523187024}, ('Afghanistan', '2014-01-01'): {'gdp': 613.856689167623}, ('Afghanistan', '2015-01-01'): {'gdp': 578.466352941708}, ('Afghanistan', '2016-01-01'): {'gdp': 547.228110150363}, ('Afghanistan', '2017-01-01'): {'gdp': 556.30200240406}, ('Afghanistan', '2018-01-01'): {'gdp': 524.162880925404}, ('Afghanistan', '2019-01-01'): 
{'gdp': 502.115486913067}, ('Angola', '1980-01-01'): {'gdp': 710.981648140027}, ('Angola', '1981-01-01'): {'gdp': 642.383857952257}, ('Angola', '1982-01-01'): {'gdp': 619.9613575311099}, ('Angola', '1983-01-01'): {'gdp': 623.440584831564}, ('Angola', '1984-01-01'): {'gdp': 637.715230700475}, ('Angola', '1985-01-01'): {'gdp': 758.237576171151}, ('Angola', '1986-01-01'): {'gdp': 685.270085316704}, ('Angola', '1987-01-01'): {'gdp': 756.2618530274119}, ('Angola', '1988-01-01'): {'gdp': 792.3031202186439}, ('Angola', '1989-01-01'): {'gdp': 890.5541364590081}, ('Angola', '1990-01-01'): {'gdp': 947.7041820853709}, ('Angola', '1991-01-01'): {'gdp': 865.69272959239}, ('Angola', '1992-01-01'): {'gdp': 656.361755960006}, ('Angola', '1993-01-01'): {'gdp': 441.200673252825}, ('Angola', '1994-01-01'): {'gdp': 328.673294707808}, ('Angola', '1995-01-01'): {'gdp': 397.17945076947194}, ('Angola', '1996-01-01'): {'gdp': 522.643807265256}, ('Angola', '1997-01-01'): {'gdp': 514.295223223424}, ('Angola', '1998-01-01'): {'gdp': 423.59366023208}, ('Angola', '1999-01-01'): {'gdp': 387.784316047502}, ('Angola', '2000-01-01'): {'gdp': 556.836318086553}, ('Angola', '2001-01-01'): {'gdp': 527.333528536691}, ('Angola', '2002-01-01'): {'gdp': 872.4944915928411}, ('Angola', '2003-01-01'): {'gdp': 982.960899291112}, ('Angola', '2004-01-01'): {'gdp': 1255.5640447149099}, ('Angola', '2005-01-01'): {'gdp': 1902.42234554625}, ('Angola', '2006-01-01'): {'gdp': 2599.56646397608}, ('Angola', '2007-01-01'): {'gdp': 3121.99563726236}, ('Angola', '2008-01-01'): {'gdp': 4080.94140992346}, ('Angola', '2009-01-01'): {'gdp': 3122.78076649385}, ('Angola', '2010-01-01'): {'gdp': 3587.88379824396}, ('Angola', '2011-01-01'): {'gdp': 4615.46802807906}, ('Angola', '2012-01-01'): {'gdp': 5100.095808097671}, ('Angola', '2013-01-01'): {'gdp': 5254.8823379961605}, ('Angola', '2014-01-01'): {'gdp': 5408.41049555432}, ('Angola', '2015-01-01'): {'gdp': 4166.97968386501}, ('Angola', '2016-01-01'): {'gdp': 3506.07288506966}, 
('Angola', '2017-01-01'): {'gdp': 4095.8129415585704}, ('Angola', '2018-01-01'): {'gdp': 3289.64666408633}, ('Angola', '2019-01-01'): {'gdp': 2973.5911597986797}}

# setup dataframe
df = pd.DataFrame.from_dict(data, orient='index')
df.index.set_names(['country', 'date'], inplace=True)

# display(df.head())
                              gdp
country     date                 
Afghanistan 1960-01-01  59.773194
            1961-01-01  59.860874
            1962-01-01  58.458015
            1963-01-01  78.706388
            1964-01-01  82.095231

# reset the index
df.reset_index(inplace=True)

# set the date column to a datetime format
df.date = pd.to_datetime(df.date).dt.date

# sort values
df.sort_values(['date', 'country'], inplace=True)

# display(df.head())
       country        date        gdp
0  Afghanistan  1960-01-01  59.773194
1  Afghanistan  1961-01-01  59.860874
2  Afghanistan  1962-01-01  58.458015
3  Afghanistan  1963-01-01  78.706388
4  Afghanistan  1964-01-01  82.095231

# plot the data with seaborn.lineplot
plt.figure(figsize=(10, 8))
sns.lineplot(x='date', y='gdp', hue='country', data=df)
plt.yscale('log')
plt.legend(title='country', bbox_to_anchor=(1.05, 1), loc='upper left')
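
If seaborn is not an option, roughly the same line chart can be drawn with pandas alone by moving the country level of the MultiIndex into the columns. This is a minimal sketch, not part of the original answer: the names sample and wide are made up for illustration, and the rows are a small rounded subset of the test data above.

```python
import pandas as pd

# A few rows copied (rounded) from the test data above; "sample" is a hypothetical name
rows = [
    ("Afghanistan", "2015-01-01", 578.47),
    ("Afghanistan", "2016-01-01", 547.23),
    ("Afghanistan", "2017-01-01", 556.30),
    ("Angola", "2015-01-01", 4166.98),
    ("Angola", "2016-01-01", 3506.07),
    ("Angola", "2017-01-01", 4095.81),
]
sample = pd.DataFrame(rows, columns=["country", "date", "gdp"])
sample["date"] = pd.to_datetime(sample["date"])
sample = sample.set_index(["country", "date"])  # MultiIndex on country and date

# Move the country level into the columns: one column per country, indexed by date
wide = sample["gdp"].unstack("country")

# DataFrame.plot draws one line per column; logy=True log-scales the y-axis
ax = wide.plot(logy=True, figsize=(10, 8))
ax.set_ylabel("gdp")
ax.legend(title="country", bbox_to_anchor=(1.05, 1), loc="upper left")
```

Countries with no value for a given date get NaN in the wide frame, so their lines simply have gaps at those dates.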

seaborn.barplot

plt.figure(figsize=(8, 15))
sns.barplot(x='gdp', y='date', data=df, orient='h', hue='country')
plt.xscale('log')
plt.legend(title='country', bbox_to_anchor=(1.05, 1), loc='upper left')

seaborn.catplot

sns.catplot(data=df, x='date', y='gdp', col='country', col_wrap=2, kind='bar', height=2.5, aspect=4).set_xticklabels(rotation=90)
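
As for the ValueError in the question: index.levels[1] holds every unique date in the whole index, while the selection for a single country only has that country's rows, so the x and y lengths differ. A minimal sketch of a matplotlib-only fix is to take the x values from the selected Series itself. The frame below is a made-up stand-in for df_no_na, built from a few rounded values that appear in this post; demo, all_dates and us are hypothetical names.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for df_no_na (assumption): countries cover different date ranges
rows = [
    ("Aruba", "2015-01-01", 27980.88),
    ("Aruba", "2016-01-01", 28281.35),
    ("Aruba", "2017-01-01", 29007.69),
    ("United States", "2016-01-01", 57951.58),
    ("United States", "2017-01-01", 60062.22),
]
demo = pd.DataFrame(rows, columns=["country", "date", "GDP per capita"])
demo["date"] = pd.to_datetime(demo["date"])
demo = demo.set_index(["country", "date"])

# The level holds all unique dates across every country (3 here)
all_dates = demo.index.levels[1]

# Select one country; the result is indexed by that country's dates only (2 here)
us = demo.xs("United States", level="country")["GDP per capita"]

# Use the Series' own index for x so x and y always have the same length
fig, ax = plt.subplots()
ax.plot(us.index, us.values)
ax.set_ylabel("GDP per capita")
```

Looping over the countries and calling ax.plot once per selection would put several countries on the same axes.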
