0

I have a pandas DataFrame, df:

     statistics  s_values
year
1999  cigarette use       100
1999  cellphone use       310
1999   internet use       101
1999    alcohol use       100
1999       soda use       215
2000  cigarette use       315
2000  cellphone use       317
2000   internet use       325
2000    alcohol use       108
2000       soda use       200
2001  cigarette use       122
2001  cellphone use       311
2001   internet use       112
2001    alcohol use       144
2001       soda use       689
2002  cigarette use       813
2002  cellphone use       954
2002   internet use       548
2002    alcohol use       882
2002       soda use       121

How can I use matplotlib to generate a plot that looks like the one I created in Excel?

[image: desired plot]

2 Answers

2

You can use seaborn to achieve something similar:

import seaborn as sns
import matplotlib.pyplot as plt


df = df.reset_index()  # only needed if 'year' is the index rather than a column
# order=3 fits a cubic polynomial per statistic; ci=None hides the confidence band
sns.lmplot(x='year', y='s_values', hue='statistics', data=df, ci=None, order=3)

OUTPUT: [image not shown]

If you just want a simple line plot, use:

# On seaborn 0.12 or later, errorbar=None replaces the deprecated ci=None.
sns.lineplot(x='year', y='s_values', hue='statistics', data=df, ci=None)

OUTPUT: [image not shown]
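For anyone wondering what order=3 does, here is a rough sketch of the same idea using only NumPy and matplotlib. seaborn documents that it uses numpy.polyfit for order > 1, so this fits a cubic per statistic by hand. The helper name cubic_curve and the centring of the years are my own choices, not part of seaborn, and only two of the statistics are included to keep it short. With four points per statistic a cubic passes exactly through every point, so treat the curve between the points as decoration rather than data.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Sample data copied from the question (two statistics only, for brevity).
df = pd.DataFrame({
    "year": [1999, 2000, 2001, 2002] * 2,
    "statistics": ["cigarette use"] * 4 + ["cellphone use"] * 4,
    "s_values": [100, 315, 122, 813, 310, 317, 311, 954],
})


def cubic_curve(years, values, n_points=100):
    """Hypothetical helper: fit a degree-3 polynomial, evaluate it on a dense grid."""
    years = np.asarray(years, dtype=float)
    values = np.asarray(values, dtype=float)
    center = years.mean()  # centre x so the fit stays well conditioned
    coeffs = np.polyfit(years - center, values, deg=3)
    dense_years = np.linspace(years.min(), years.max(), n_points)
    return dense_years, np.polyval(coeffs, dense_years - center)


fig, ax = plt.subplots()
for stat, group in df.groupby("statistics", sort=False):
    xs, ys = cubic_curve(group["year"], group["s_values"])
    line, = ax.plot(xs, ys, label=stat)  # fitted curve
    ax.scatter(group["year"], group["s_values"], color=line.get_color())  # raw points
ax.set_xticks(sorted(df["year"].unique()))  # show whole years only
ax.legend(title="statistics")

if __name__ == "__main__":
    plt.show()
```

With more than four years of data the cubic would no longer hit every point, and the result would look more like a trend line than a connect-the-dots curve.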


2 Comments

Is there a way I could do it using only matplotlib?
@morello Not without also using scipy to interpolate the values in between, I believe.
1

You can do this with pandas alone (it uses matplotlib as its plotting backend, but you don't need to write any matplotlib-specific code).

If 'year' in your df is an index and not a column, you should preface this code with df = df.reset_index().

Then:

df.pivot(index = 'year', columns = 'statistics', values = 's_values').plot()

This gives you a plot with one line per statistic: [image not shown]
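To see why the one-liner works, it can help to look at what pivot returns before plotting. The sketch below uses a cut-down copy of the question's data (two statistics only); the variable names long_df and wide are just for illustration.

```python
import pandas as pd

# Sample rows copied from the question (two statistics only, for brevity),
# with 'year' as the index, as in the question's printout.
df = pd.DataFrame(
    {
        "statistics": ["cigarette use", "cellphone use"] * 4,
        "s_values": [100, 310, 315, 317, 122, 311, 813, 954],
    },
    index=pd.Index([1999, 1999, 2000, 2000, 2001, 2001, 2002, 2002], name="year"),
)

# Turn the 'year' index back into an ordinary column so pivot can use it.
long_df = df.reset_index()

# Long -> wide: one row per year, one column per statistic.
wide = long_df.pivot(index="year", columns="statistics", values="s_values")
print(wide)

# wide.plot() would draw each column of `wide` as its own line against the index.
```

Since DataFrame.plot draws one line per column with the index on the x-axis, the wide layout is exactly what it needs.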

EDIT:

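As noted by @Nk03 in the comments, if your DataFrame has exactly 3 columns and they are in the right order (that is, the columns are ['a', 'b', 'c'] and you want index = 'a', columns = 'b', values = 'c'), you can do df.pivot(*df).plot() to achieve the same effect. This shortcut passes the column names positionally, which works on older pandas versions; pandas 2.0 and later require keyword arguments for pivot, so use the explicit form there.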

EDIT:

As per the comments, using matplotlib specifically and not caring about smooth lines:

import matplotlib.pyplot as plt

for stat in df['statistics'].unique():
    subset = df[df['statistics'] == stat]  # rows for this statistic only
    plt.plot(subset['year'], subset['s_values'], label=stat)
plt.legend(title='statistics')

This loops over the unique values in the 'statistics' column, plots each one as a separate line with a label for the legend, and then draws the legend once everything is plotted.

This will give you: [image not shown]
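If you do want curved lines with plain matplotlib, the comments suggest interpolating between the yearly values with scipy. Below is one possible sketch of that, under my own assumptions: I picked scipy.interpolate.PchipInterpolator because it passes through every data point without overshooting, but other interpolators would work too. The helper smooth_series is a made-up name, and only two of the statistics are included to keep it short.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.interpolate import PchipInterpolator

# Sample data copied from the question (two statistics only, for brevity).
df = pd.DataFrame({
    "year": [1999, 2000, 2001, 2002] * 2,
    "statistics": ["internet use"] * 4 + ["soda use"] * 4,
    "s_values": [101, 325, 112, 548, 215, 200, 689, 121],
})


def smooth_series(group, n_points=200):
    """Hypothetical helper: interpolate one statistic onto a dense grid of years."""
    group = group.sort_values("year")  # the interpolator needs increasing x
    interp = PchipInterpolator(group["year"].to_numpy(), group["s_values"].to_numpy())
    dense_years = np.linspace(group["year"].min(), group["year"].max(), n_points)
    return dense_years, interp(dense_years)


fig, ax = plt.subplots()
for stat, group in df.groupby("statistics", sort=False):
    xs, ys = smooth_series(group)
    ax.plot(xs, ys, label=stat)
ax.set_xticks(sorted(df["year"].unique()))  # show whole years only
ax.legend(title="statistics")

if __name__ == "__main__":
    plt.show()
```

The values between the years are invented by the interpolator, so this is a cosmetic choice rather than extra information.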

6 Comments

I was hoping to strictly use matplotlib to generate the plot
@Morello is it a specific task to do it using matplotlib? Seeing as you already have pandas DataFrame, and pandas uses matplotlib for its plotting, this doesn't use anything else.
I was able to use pandas to achieve this but I also wanted to learn how I could generate a similar plot using only matplotlib for the future
If you were looking for a simple lineplot you could loop over statistics values and plot individual lines, however as @Nk03 mentioned above, if you want the curvy smooth lines, you would need to interpolate missing values.
If columns are in order OP can directly use - df.pivot(*df).plot(). You can add it in your answer @dm2 :)