Differences between seaborn histogram, countplot and distplot

Question

I think they all look the same but there must be some difference.

They all take a single column as input, and the y-axis has the count for all plots.

Trenton McKinney · Accepted Answer · 2022-07-05 15:36:45Z

43

pyplot.hist, seaborn.countplot and seaborn.distplot are all helper functions for plotting the frequency of a single variable. Which one is most suitable for visualization depends on the nature of that variable.

Continuous variable

A continuous variable x may be histogrammed to show the frequency distribution.

import matplotlib.pyplot as plt
import numpy as np

x = np.random.rand(100)*100
hist, edges = np.histogram(x, bins=np.arange(0,101,10))
plt.bar(edges[:-1], hist, align="edge", ec="k", width=np.diff(edges))

plt.show()

The same can be achieved using pyplot.hist or seaborn.distplot,

plt.hist(x, bins=np.arange(0,101,10), ec="k")

or

import seaborn as sns
sns.distplot(x, bins=np.arange(0,101,10), kde=False, hist_kws=dict(ec="k"))

distplot (see note 1) wraps pyplot.hist, but adds some features on top, e.g. the option to show a kernel density estimate.

Discrete variable

For a discrete variable, a histogram may or may not be suitable. If you use numpy.histogram, the bin edges need to lie exactly in between the expected discrete observations.

x1 = np.random.randint(1,11,100)

hist, edges = np.histogram(x1, bins=np.arange(1,12)-0.5)
plt.bar(edges[:-1], hist, align="edge", ec="k", width=np.diff(edges))
plt.xticks(np.arange(1,11))
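To see why the edges are shifted by 0.5, it may help to compare the counts numerically. The following sketch uses a small hand-made array (an assumption for illustration, not the random data above) in which every integer from 1 to 10 occurs exactly once.

```python
import numpy as np

values = np.arange(1, 11)               # each integer 1..10 exactly once

# Integer edges 1, 2, ..., 10 give only nine bins. All bins are half-open
# except the last one, which is closed on both sides ([9, 10]), so the
# values 9 and 10 end up in the same bin.
counts_int, _ = np.histogram(values, bins=np.arange(1, 11))

# Half-integer edges 0.5, 1.5, ..., 10.5 give one bin per integer.
counts_half, _ = np.histogram(values, bins=np.arange(1, 12) - 0.5)

print(counts_int)    # [1 1 1 1 1 1 1 1 2]
print(counts_half)   # [1 1 1 1 1 1 1 1 1 1]
```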

One could instead also count the unique elements in x1,

u, counts = np.unique(x1, return_counts=True)
plt.bar(u, counts, align="center", ec="k", width=1)
plt.xticks(u)

resulting in the same plot as above. The main difference shows up when not every possible value actually occurs in the data. Say 5 is not part of your data at all. A histogram approach would still show an (empty) bin for it, while the unique-count approach has no bar for it because 5 is not among the unique elements.

x2 = np.random.choice([1,2,3,4,6,7,8,9,10], size=100)

plt.subplot(1,2,1)
plt.title("histogram")
hist, edges = np.histogram(x2, bins=np.arange(1,12)-0.5)
plt.bar(edges[:-1], hist, align="edge", ec="k", width=np.diff(edges))
plt.xticks(np.arange(1,11))

plt.subplot(1,2,2)
plt.title("counts")
u, counts = np.unique(x2, return_counts=True)
plt.bar(u.astype(str), counts, align="center", ec="k", width=1)

The latter is what seaborn.countplot does.

sns.countplot(x=x2, color="C0")

It is hence suitable for discrete or categorical variables.

Summary

All three functions, pyplot.hist, seaborn.countplot and seaborn.distplot, act as wrappers for a matplotlib bar plot and may be used when plotting such a bar plot manually is considered too cumbersome.
For continuous variables, pyplot.hist or seaborn.distplot may be used. For discrete variables, seaborn.countplot is more convenient.

1. Note: sns.distplot is deprecated since seaborn 0.11. For figure-level plots, use sns.displot, and for axes-level plots, use sns.histplot.

edited Jul 5, 2022 at 15:36

Trenton McKinney

answered Jan 22, 2019 at 13:46

ImportanceOfBeingErnest


2 Comments

user2340939 Over a year ago

Actually, I think both continuous and discrete variables, hence numerical variables, should be presented with pyplot.hist or seaborn.displot, and seaborn.countplot should be used for categorical variables. From the seaborn documentation of seaborn.countplot: "Show the counts of observations in each categorical bin using bars... A count plot can be thought of as a histogram across a categorical, instead of quantitative, variable". See seaborn.pydata.org/generated/seaborn.countplot.html.

Salih Over a year ago

What's the difference between figure-level plots and axis-level plots?
