I have run numpy.histogram() on a number of subsets of a larger dataset. I want to separate the calculations from the graphical output, so I would prefer not to call matplotlib.pyplot.hist() on the data itself.

In principle, both of these functions take the same inputs: the raw data itself, before binning. The numpy version just returns the nbin+1 bin edges and nbin frequencies, whereas the matplotlib version goes on to make the plot itself.

So is there an easy way to generate the histograms from the numpy.histogram() output itself, without redoing the calculations (and having to save the inputs)?

To be clear, the numpy.histogram() output is a pair of arrays: the nbin frequencies and the nbin+1 bin edges. As far as I can tell, there is no matplotlib routine which takes those as input.

2 Answers

You can plot the output of numpy.histogram using plt.bar.

import matplotlib.pyplot as plt
import numpy as np; np.random.seed(1)

a = np.random.rayleigh(scale=3,size=100)
bins = np.arange(10)

frq, edges = np.histogram(a, bins)

fig, ax = plt.subplots()
ax.bar(edges[:-1], frq, width=np.diff(edges), edgecolor="black", align="edge")

plt.show()
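A note on why the call above passes width=np.diff(edges) and align="edge": each bar then starts at its bin's left edge and spans exactly that bin, which also holds when the bins are not equally wide. Here is a minimal sketch of that case. The bin edges are made up for illustration, and the Agg backend is only selected so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to view the plot with plt.show()
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
data = rng.rayleigh(scale=3, size=100)

# Hypothetical non-uniform bin edges, chosen only for illustration.
bin_edges = np.array([0, 1, 2, 4, 8, 16])
counts, edges = np.histogram(data, bins=bin_edges)

fig, ax = plt.subplots()
# One bar per bin: left side at edges[i], width edges[i+1] - edges[i].
bars = ax.bar(edges[:-1], counts, width=np.diff(edges),
              align="edge", edgecolor="black")
```

With align="center" (the default), the same x values would be treated as bar centres, and every bar would be shifted left by half its width.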

3 Comments

Ya know, maybe you just caught me at a bad time, and I'm down with work-arounds, but it seems weird that plt.hist cannot take in the output from np.histogram. What if you want to rescale the y-axis? np.histogram's output makes that easy, then you could plot it easy. What if you want histtype=u'step' ? now I have to figure that out in bar. All of these work-arounds would just be solved if these two things that should work together did... Anyways, +1 lol
@kηives plt.hist is a wrapper for np.histogram. Since np.histogram does not take the output of a previous call to np.histogram, the same is true for plt.hist. The question is hence: Do you want to plot something you would obtain via np.histogram? If yes, use plt.hist, if no (e.g. because you need to manipulate the data), don't use plt.hist. This makes sense, because if you manipulate the data, it's not a histogram any more, so the hist function is not responsible any more. To plot a step function, use plt.step(...).
how would you write such a figure to disk?
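For what it's worth, the points raised in these comments (rescaling the y-axis, a step outline, and writing the figure to disk) can all be handled from the np.histogram output. The following is a rough sketch, not the answerer's code: the output file name is hypothetical, the rescaling to fractions is just one possible choice, and the Agg backend is only there so it runs without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, sufficient for writing files
import matplotlib.pyplot as plt
import numpy as np

a = np.random.RandomState(1).rayleigh(3, size=100)
counts, edges = np.histogram(a, bins=np.arange(10))

# Rescale by hand: fraction of the binned samples that falls in each bin.
fractions = counts / counts.sum()

fig, ax = plt.subplots()
# where="post" holds each value from its x position to the next one, so the
# last value is repeated to draw the final bin up to the last edge.
ax.step(edges, np.append(fractions, fractions[-1]), where="post")
ax.set_ylabel("fraction of samples")

# Hypothetical output location; any path with a supported extension works.
out_path = os.path.join(tempfile.gettempdir(), "histogram_step.png")
fig.savefig(out_path, dpi=150)
plt.close(fig)
```

fig.savefig picks the file format from the extension, so the same call with a ".pdf" or ".svg" name should give vector output.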

New in matplotlib 3.4.0

It's no longer necessary to manually reconstruct a bar chart, as there's now a built-in method:

Use the new plt.stairs method for the common case where you know the values and edges of the steps, for instance when plotting the output of np.histogram.

Note that stairs are plotted as lines by default, so use fill=True for a solid histogram:

import matplotlib.pyplot as plt
import numpy as np

a = np.random.RandomState(1).rayleigh(3, size=100)

counts, edges = np.histogram(a, bins=range(10))
plt.stairs(counts, edges, fill=True)

If you want a more conventional "bar" aesthetic, combine with plt.vlines:

plt.stairs(counts, edges, fill=True)
plt.vlines(edges, 0, counts.max(), colors='w')

If you don't need the counts and edges, just unpack np.histogram directly into plt.stairs:

plt.stairs(*np.histogram(a), fill=True)

And as usual, there is an ax.stairs counterpart:

fig, ax = plt.subplots()
ax.stairs(*np.histogram(a), fill=True)
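Since the question is about keeping the calculation separate from the plotting, here is one way that split might look with stairs. This is only a sketch under a few assumptions: the subset names and the .npz file name are hypothetical, all subsets share the same bin edges, and matplotlib is at least 3.4.0.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to view the plot
import matplotlib.pyplot as plt
import numpy as np

# --- calculation step: bin each subset once and store only counts and edges ---
rng = np.random.default_rng(1)
subsets = {  # hypothetical subsets of a larger dataset
    "subset_a": rng.rayleigh(3, size=100),
    "subset_b": rng.rayleigh(4, size=100),
}
edges = np.arange(10)
counts_by_name = {name: np.histogram(x, bins=edges)[0] for name, x in subsets.items()}

path = os.path.join(tempfile.gettempdir(), "histograms.npz")  # hypothetical file name
np.savez(path, edges=edges, **counts_by_name)

# --- plotting step: needs only the stored arrays, not the raw data ---
stored = np.load(path)
fig, ax = plt.subplots()
artists = [
    ax.stairs(stored[name], stored["edges"], label=name)
    for name in stored.files
    if name != "edges"
]
ax.legend()
```

The plotting step could live in a separate script, since it only reads the saved counts and edges.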

2 Comments

Is it possible to have each of the bars with its own edgecolor? (I.e., which goes in between the bars, all the way down to the axis, as in the example in the other answer.)
@AndrewJaffe I added an option with vlines, but that will only look good with white. If you want any other edgecolor (e.g., black), bar is probably still the way to go.
