Having trouble plotting 2-D histogram with numpy.histogram2d and matplotlib

Question

I have a massive data set where I need to split my plot into a grid and count the number of points within each grid square. I'm following a method outlined here:

with a stripped-down version of my code below:

import numpy as np
import matplotlib.pyplot as plt

x = [ 1.83259571, 1.76278254, 1.38753676, 1.6406095, 1.34390352, 1.23045712, 1.85877565, 1.26536371, 0.97738438]

y = [ 0.04363323, 0.05235988, 0.09599311, 0.10471976, 0.1134464, 0.13962634, 0.17453293, 0.20943951, 0.23561945]

gridx = np.linspace(min(x),max(x),11)
gridy = np.linspace(min(y),max(y),11)

grid, _, _ = np.histogram2d(x, y, bins=[gridx, gridy])

plt.figure()
plt.plot(x, y, 'ro')
plt.grid(True)

plt.figure()
plt.pcolormesh(gridx, gridy, grid)
plt.plot(x, y, 'ro')
plt.colorbar()

plt.show()

The problem is that the grid marks some cells as containing points where there are none, while cells where data points actually appear are shown as empty.

What might be causing this problem? Also, sorry for not attaching the plot, I'm a new user and my reputation isn't high enough.

UPDATE: Here's code that generates 100 random points and attempts to plot them in a 2-D histogram:

import numpy as np
import matplotlib.pyplot as plt

x = np.random.rand(100)

y = np.random.rand(100)

gridx = np.linspace(0,1,11)
gridy = np.linspace(0,1,11)

grid, __, __ = np.histogram2d(x, y, bins=[gridx, gridy])

plt.figure()
plt.plot(x, y, 'ro')
plt.grid(True)

plt.figure()
plt.pcolormesh(gridx, gridy, grid)
plt.plot(x, y, 'ro')
plt.colorbar()

plt.show()

Yet when I run it I have the same problem as before: the locations of the points and the colors corresponding to point-location-density don't agree. Does this happen when anyone runs this code for themselves?

SECOND UPDATE

And at the risk of beating a dead horse, here's code for a parametric plot:

import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(0,1,100)
x = np.sin(t)
y = np.cos(t)

gridx = np.linspace(0,1,11)
gridy = np.linspace(0,1,11)

#grid, __, __ = np.histogram2d(x, y, bins=[gridx, gridy])
grid, __, __ = np.histogram2d(x, y)

plt.figure()
plt.plot(x, y, 'ro')
plt.grid(True)

plt.figure()
plt.pcolormesh(gridx, gridy, grid)
plt.plot(x, y, 'ro')
plt.colorbar()

plt.show()

which makes me think this is all some kind of weird scaling issue. Still totally lost though...

The reason the above does not work may be the lack of data points in your data. You seem to have only 9 data points for x and y, whereas the example you follow has 100 data points. Try the same example with 9 points and it does not work!
– Srivatsan, Commented Jun 2, 2014 at 21:13
Could np.histogram2d have a problem with randomly scattered small numbers? I tried what you suggested with 100 points but it still didn't work. Strangely enough, when I tried a test case of x and y equal to linspace(0,1,100), the colormesh function worked perfectly.
– user3555455, Commented Jun 2, 2014 at 22:02
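
One possible reason the linspace test case looked fine: when x and y are identical, every point falls in a diagonal bin, so the count array equals its own transpose and a swapped axis order cannot show up in the plot. The sketch below checks this with NumPy alone. The curve y = x**2 is an assumption added here only to provide an asymmetric comparison; it is not from the question.

```python
import numpy as np

edges = np.linspace(0, 1, 11)
t = np.linspace(0, 1, 100)

# Test case from the comment: x and y are the same array.
H_diag, _, _ = np.histogram2d(t, t, bins=[edges, edges])
# All counts sit on the diagonal, so transposing changes nothing.
print(np.array_equal(H_diag, H_diag.T))

# Asymmetric comparison (illustrative assumption): y = x**2 lies below the diagonal.
H_curve, _, _ = np.histogram2d(t, t**2, bins=[edges, edges])
# Here the array and its transpose differ, so the axis order becomes visible.
print(np.array_equal(H_curve, H_curve.T))
```

If the second check reports a difference while the first does not, that would be consistent with an axis-order mix-up rather than a problem with small or random numbers.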

paisanco · Accepted Answer · 2014-06-18 05:35:02Z

I was able to get your example to work by using imshow with interpolation instead of pcolormesh. See sample code below.

I think the problem may be that pcolormesh has a different origin convention than plot. The results of pcolormesh look like the upper left and lower right are flipped.

The result with imshow looks like:

[Image: imshow result]

The sample code:

import numpy as np
import matplotlib.pyplot as plt

def doPlot():

    x = [ 1.83259571, 1.76278254, 1.38753676, 1.6406095, 1.34390352, 1.23045712, 1.85877565, 1.26536371, 0.97738438]

    y = [ 0.04363323, 0.05235988, 0.09599311, 0.10471976, 0.1134464, 0.13962634, 0.17453293, 0.20943951, 0.23561945]

    gridx = np.linspace(min(x),max(x),11)
    gridy = np.linspace(min(y),max(y),11)

    H, xedges, yedges = np.histogram2d(x, y, bins=[gridx, gridy])

    plt.figure()
    plt.plot(x, y, 'ro')
    plt.grid(True)

    #wrong origin convention for pcolormesh?
    #plt.figure()
    #plt.pcolormesh(gridx, gridy, H)
    #plt.plot(x, y, 'ro')
    #plt.colorbar()


    plt.figure()
    myextent  =[xedges[0],xedges[-1],yedges[0],yedges[-1]]
    # 'lower' puts row 0 at the bottom; recent matplotlib versions reject 'low'
    plt.imshow(H.T, origin='lower', extent=myextent, interpolation='nearest', aspect='auto')
    plt.plot(x,y,'ro')
    plt.colorbar()

    plt.show()

if __name__=="__main__":
    doPlot()
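
If you would rather keep pcolormesh, transposing the histogram before plotting is likely enough. np.histogram2d bins x along the first array axis, while pcolormesh reads the first axis as rows, that is, as y. Below is a minimal sketch of that idea; the helper names counts_for_pcolormesh and plot_counts are made up for illustration and are not part of NumPy or matplotlib.

```python
import numpy as np


def counts_for_pcolormesh(x, y, xedges, yedges):
    """Return counts laid out as C[y_bin, x_bin], the layout pcolormesh expects."""
    H, xe, ye = np.histogram2d(x, y, bins=[xedges, yedges])
    # H is indexed H[x_bin, y_bin]; transpose so rows follow y and columns follow x.
    return H.T, xe, ye


def plot_counts(x, y, xedges, yedges):
    """Draw the binned counts with the raw points on top and return the figure."""
    # Imported here so the binning helper can be used without matplotlib.
    import matplotlib.pyplot as plt

    C, xe, ye = counts_for_pcolormesh(x, y, xedges, yedges)
    fig, ax = plt.subplots()
    mesh = ax.pcolormesh(xe, ye, C)
    ax.plot(x, y, 'ro')
    fig.colorbar(mesh, ax=ax)
    return fig
```

With the data from the question, calling plot_counts(x, y, gridx, gridy) followed by plt.show() should place each red point inside a colored cell.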

Kyle Bogdan · Answer · 2015-03-27 18:39:25Z

Referencing numpy histogram2d's documentation...

The careful reader will note that the parameters are backwards.

histogram2d(y, x, bins=(xedges, yedges))

Compute the bi-dimensional histogram of two data samples.

Parameters

x : array_like, shape (N,) An array containing the x coordinates of the points to be histogrammed.

y : array_like, shape (N,) An array containing the y coordinates of the points to be histogrammed.

Ergo, you supplied your x's to the y parameter of the function and vice versa for the y's.

Regards

kadrlica Over a year ago

This is not true in numpy 1.13: docs.scipy.org/doc/numpy-1.13.0/reference/generated/…
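
Rather than relying on one version of the documentation, the axis convention can be checked directly on the installed NumPy. The sketch below bins a single point with a large x and a small y into a 2x2 grid and reports where the count lands; the point and the edges are arbitrary values chosen for illustration.

```python
import numpy as np

# Two bins per axis: [0, 0.5) and [0.5, 1].
edges = np.linspace(0.0, 1.0, 3)

# One point with large x (second x bin) and small y (first y bin).
H, xedges, yedges = np.histogram2d([0.75], [0.25], bins=[edges, edges])

# Locate the single non-zero cell.
first_index, second_index = np.argwhere(H == 1)[0]
print(H)
print("first index:", first_index, "second index:", second_index)
```

If the first index comes out as 1 and the second as 0, the first array axis follows x. In that case passing x first to histogram2d is correct, and the array needs a transpose before it is handed to a plotting function that treats rows as y.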
