I would like to draw a stack plot with a colormap, as shown in Figure 5 of this paper. Here is a screenshot of that figure:

[Image: screenshot of Figure 5 from the paper]

Currently, I am able to draw a scatter plot of a similar nature.

[Image: the current scatter plot]

I would like to convert this scatter plot to a stack plot with a colormap, but I am a bit lost on how to do this. My initial guess is that for each (x, y) point I need a list of z points on the colormap spectrum. I wonder, however, whether there is a simpler way to do this. Here is my code to generate the scatter plot with a colormap:

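import matplotlib.colors
import matplotlib.pyplot as plt
import numpy as np

# x, y and z are 1-D arrays of equal length
cm = plt.get_cmap('RdYlBu')
plt.xscale('log')
plt.yscale('log')
sc = plt.scatter(x, y, c=z, marker='x', norm=matplotlib.colors.Normalize(vmin=np.min(z), vmax=np.max(z)), s=35, cmap=cm)
plt.colorbar(sc)
plt.show()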

Edit

I feel I need to find a way to convert the z-array to multiple z-arrays - one for each bin on the color bar. I can then simply create a stacked area chart from these derived z-arrays.

Edit 2

I followed Rutger's code and was able to produce this graph for my data. I wonder why there's an issue with the axes limits.

[Image: the graph produced for my data]

  • There are stackplot examples in the gallery: matplotlib.org/examples/pylab_examples/stackplot_demo.html But looking at your data, I doubt that is what you want; a stackplot is meant for distinct series. It seems you are looking for a contour plot instead: matplotlib.org/examples/pylab_examples/contour_image.html Commented Dec 2, 2013 at 15:58
  • @RutgerKassies Can you explain why? I tried the contour plot from the example section and other sources, and it doesn't really reproduce the figure I posted. I feel that there are 6 bins for the z-values and that I need to draw them using a stacked area chart. Commented Dec 2, 2013 at 17:56
  • @Dexter - You have scattered points with an x,y,z value for each, rather than a series of lines. A stacked plot is just a series of lines with the intervals between each of them filled. What you want (and what is shown in the figures you link to) is a contour plot of interpolated "z" values given several x,y,z observations. You'll need to interpolate your scattered data onto a regular grid and then use contourf. Commented Dec 2, 2013 at 19:57
  • Also, on a side note, in your call to scatter, you don't need to specify a custom norm if all you're doing is a linear scale. Leaving the norm argument out entirely would have exactly the same effect as your current code. Hope that helps a bit! Commented Dec 2, 2013 at 20:02
  • @Dexter - No, it's just that you fundamentally can't do what you want with stackplot. (For example, notice the "island" in the figure that you're trying to copy.) If you were going to try to use stackplot, you'd first have to use contour to identify lines of constant "z" value before plotting them with stackplot. contourf basically does this in one step, but it plots things more flexibly than stackplot ever could (for example, it will happily handle "islands", a.k.a. closed contours). Commented Dec 2, 2013 at 21:28
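
The approach described in the comments above (interpolate the scattered points onto a regular grid, then call contourf) might look roughly like the sketch below. It is not from the original thread: the synthetic data, the helper name grid_scattered and the grid size are assumptions for illustration, and scipy.interpolate.griddata is only one of several ways to do the interpolation.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import griddata


def grid_scattered(x, y, z, n=100):
    """Interpolate scattered (x, y, z) samples onto a regular n-by-n grid.

    Hypothetical helper, named here only for illustration.
    """
    xi = np.linspace(x.min(), x.max(), n)
    yi = np.linspace(y.min(), y.max(), n)
    xgrid, ygrid = np.meshgrid(xi, yi)
    # Grid nodes outside the convex hull of the samples come back as NaN
    zgrid = griddata((x, y), z, (xgrid, ygrid), method='linear')
    return xgrid, ygrid, zgrid


# Synthetic stand-in data (an assumption); replace with your own x, y, z
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 500)
y = rng.uniform(0, 10, 500)
z = np.sin(x) * np.cos(y)

xgrid, ygrid, zgrid = grid_scattered(x, y, z)

fig, ax = plt.subplots()
# levels=6 asks for roughly six filled bands; matplotlib picks "nice" boundaries
cs = ax.contourf(xgrid, ygrid, np.ma.masked_invalid(zgrid), levels=6, cmap='RdYlBu')
fig.colorbar(cs)
```

Call plt.show() or fig.savefig(...) to see the result. For data plotted on log axes, as in the question, one option is to interpolate log10(x) and log10(y) instead of the raw values, so that the grid is evenly spaced on the plotted scale.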

1 Answer

It seems from your example scatter plot that you have a lot of points. Plotting these as individual markers hides a large portion of your data, because only the 'top' points (the ones drawn last) remain visible. With this much data, aggregating before plotting gives a better visual representation.

The example below shows how you can bin and average your data by using a 2d histogram. Plotting the result as either an image or a contour is fairly straightforward once your data is in an appropriate format for visual display.

Aggregating the data before plotting also improves performance and helps avoid "Array too big" or other memory-related errors.

import numpy as np
import matplotlib.pyplot as plt

fig, ax = plt.subplots(1, 3, figsize=(15,5), subplot_kw={'aspect': 1})

n = 100000

x = np.random.randn(n)
y = np.random.randn(n)+5
data_values = y * x

# Normal scatter, like your example
ax[0].scatter(x, y, c=data_values, marker='x', alpha=.2)
ax[0].set_xlim(-5,5)


# Get the extent to scale the other plots in a similar fashion
xrng = list(ax[0].get_xbound())
yrng = list(ax[0].get_ybound())

# number of bins used for aggregation (must be an integer for np.histogram2d)
n_bins = 130

# create the histograms
counts, xedge, yedge = np.histogram2d(x, y, bins=(n_bins,n_bins), range=[xrng,yrng])
sums, xedge, yedge = np.histogram2d(x, y, bins=(n_bins,n_bins), range=[xrng,yrng], weights=data_values)

# bins with a zero count give 0/0 = NaN; suppress the resulting warning
with np.errstate(invalid='ignore'):
    data_avg = sums / counts

ax[1].imshow(data_avg.T, origin='lower', interpolation='none', extent=xrng+yrng)

xbin_size = (xrng[1] - xrng[0])  / n_bins # the range divided by n_bins
ybin_size = (yrng[1] - yrng[0])  / n_bins # the range divided by n_bins

# create x,y coordinates for the histogram
# coordinates should be shifted from edge to center
xgrid, ygrid = np.meshgrid(xedge[1:] - (xbin_size / 2) , yedge[1:] - (ybin_size / 2))

ax[2].contourf(xgrid, ygrid, data_avg.T)

ax[0].set_title('Scatter')
ax[1].set_title('2D histogram with imshow')
ax[2].set_title('2D histogram with contourf')

plt.show()

[Image: output of the code above, with the scatter, imshow and contourf panels side by side]
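
Because the question plots on logarithmic axes, one possible adaptation of the histogram approach above is to use log-spaced bin edges. This is a sketch and not part of the original answer: the helper name binned_mean_log, the synthetic data and the bin count are hypothetical, and it assumes all x and y values are strictly positive.

```python
import numpy as np
import matplotlib.pyplot as plt


def binned_mean_log(x, y, values, n_bins=50):
    """Average `values` in log-spaced 2D bins; empty bins become NaN.

    Hypothetical helper; assumes x > 0 and y > 0 everywhere.
    """
    xedges = np.logspace(np.log10(x.min()), np.log10(x.max()), n_bins + 1)
    yedges = np.logspace(np.log10(y.min()), np.log10(y.max()), n_bins + 1)
    counts, _, _ = np.histogram2d(x, y, bins=(xedges, yedges))
    sums, _, _ = np.histogram2d(x, y, bins=(xedges, yedges), weights=values)
    # 0/0 in empty bins yields NaN; silence the warning
    with np.errstate(invalid='ignore', divide='ignore'):
        avg = sums / counts
    return xedges, yedges, avg


# Synthetic stand-in data (an assumption), spread over three decades
rng = np.random.default_rng(1)
x = 10 ** rng.uniform(0, 3, 20000)
y = 10 ** rng.uniform(0, 3, 20000)
z = np.log10(x) + np.log10(y)

xedges, yedges, avg = binned_mean_log(x, y, z)

fig, ax = plt.subplots()
# histogram2d returns shape (nx, ny); pcolormesh expects (ny, nx), hence .T
mesh = ax.pcolormesh(xedges, yedges, np.ma.masked_invalid(avg.T), cmap='RdYlBu')
ax.set_xscale('log')
ax.set_yscale('log')
fig.colorbar(mesh)
```

pcolormesh is used here instead of imshow because it accepts unevenly spaced bin edges directly, so the cells line up with the log-scaled axes.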


1 Comment

Thank you for the detailed code and explanation. It really helped. I am now getting the figure (I have edited my original post) with some issues with the limits of the axes. What could possibly be the reason?
