I am trying to plot some data so I can visualise it, but I am having some trouble. I have two values per sample. One, called label, is discrete (0 or 1). The other is continuous, anywhere between 0 and 1. I want to create a histogram with several bars along the x axis, for example one bar for every 0.25 of the range, giving four bars: 0-0.25, 0.25-0.5, 0.5-0.75 and 0.75-1.

Each bar would then be split along the y axis by whether label is 1 or 0, so we end up with a graph like this:

Please excuse the poor paint image!

If there are any effective, intelligent ways to split up my data (rather than just hardcoding four bars for these values), I would be interested in those too, though that probably warrants another question. I will post it once I have this code running.

I have both values stored in numpy arrays as follows, but I am unsure how to plot a graph like this:

import numpy as np
import pylab as P
import matplotlib.pyplot as plt

variable_values = trainData.get_vector('variable') #returns one dimensional numpy array of vals
label_values = trainData.get_vector('label')
valid = variable_values != '?' #mask removing void vals
x = variable_values[valid].astype(float)
y = label_values[valid].astype(float) #same mask, so x and y stay aligned

fig = plt.figure()

plt.title("Feature breakdown histogram")
plt.xlabel("Variable")
plt.xlim(0, 1)
plt.ylabel("Label")
plt.ylim(0, 1)
xvals = np.arange(0, 1, .02) #step of 0.02 (np.linspace takes a count, not a step)

plt.show()

The matplotlib tutorial shows the following code to roughly achieve what I want, but I can't really understand how it works (LINK) :

P.figure()

n, bins, patches = P.hist(x, 10, density=True, histtype='bar', stacked=True)  # normed=1 in older Matplotlib versions

P.show()

Any help is greatly appreciated. Thank you.

Edit :

I am now getting the error :

AssertionError: incompatible sizes: argument 'height' must be length 5 or scalar

I have printed my two numpy arrays and they are of equal length, one is discrete, the other continuous. Here is the code I am running :

x = variable_values[variable_values != '?'].astype(float)
y = label_values[variable_values != '?'].astype(float) #same mask as x so the arrays stay aligned

print(x) #printing numpy arrays of equal size, x is continuous, y is discrete. Both of type float now.
print(y)

N = 5
ind = np.arange(N)    # the x locations for the groups
width = 0.45       # the width of the bars: can also be len(x) sequence

p1 = plt.bar(ind, y,   width, color='r') #error occurs here
p2 = plt.bar(ind, x, width, color='y',
             bottom=x)

plt.ylabel('Scores')
plt.title('Scores by group and gender')
plt.xticks(ind+width/2., ('G1', 'G2', 'G3', 'G4', 'G5') )
plt.yticks(np.arange(0,81,10))
plt.legend( (p1[0], p2[0]), ('Men', 'Women') )

plt.show()
  • Your x values must be a 2d array. Did you notice the command x = mu + sigma*P.randn(1000,3) in the link you gave? This is used to make the three stacked bars. Commented Feb 13, 2014 at 19:14
  • The error comes from the N variable, which is the number of bars in the histogram. Either write a 4, or use len(x). Commented Feb 13, 2014 at 21:11
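
As a rough illustration of the first comment above, here is a minimal sketch of passing more than one dataset to `hist` so that the bars stack. Instead of a 2d array it uses a list of two arrays (the values for label 0 and the values for label 1), which `hist` also accepts. The data is randomly generated for the example, and the names `rng`, `x` and `y` are assumptions rather than the asker's real arrays.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove for an interactive window
import matplotlib.pyplot as plt

# Made-up data standing in for the real arrays (assumption):
# x is continuous in [0, 1), y is a 0/1 label that is more often 1 when x is large.
rng = np.random.default_rng(0)
x = rng.random(200)
y = (rng.random(200) < x).astype(int)

fig, ax = plt.subplots()

# One array per label; stacked=True draws the second series on top of the first.
n, bins, patches = ax.hist(
    [x[y == 0], x[y == 1]],
    bins=4,
    range=(0, 1),
    histtype="bar",
    stacked=True,
    color=["r", "y"],
    label=["label = 0", "label = 1"],
)
ax.set_xlabel("Variable")
ax.set_ylabel("Count")
ax.legend()
```

With `stacked=True` the returned `n` holds cumulative heights, so its last row is the total count per bin. Call `plt.show()` or `fig.savefig(...)` to see the result.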

1 Answer

I think this other tutorial from the same Matplotlib gallery will be much more revealing to you ...

Notice that the second series of data has an extra argument in the call: bottom

p1 = plt.bar(ind, menMeans,   width, color='r', yerr=menStd)
p2 = plt.bar(ind, womenMeans, width, color='y',
             bottom=menMeans, yerr=womenStd)

Just replace menMeans with x and womenMeans with y.
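
Note that `bar` needs one height per bar, so the raw per-sample arrays have to be reduced to per-bin counts first. The following is a minimal sketch of that step under some assumptions: the helper names `binned_label_counts` and `plot_stacked` are hypothetical, the sample data is made up, and four equal-width bins over 0 to 1 are used as in the question.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove for an interactive window
import matplotlib.pyplot as plt


def binned_label_counts(x, y, n_bins=4):
    """Count samples per bin of x, separately for label 0 and label 1."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)  # e.g. 0, 0.25, 0.5, 0.75, 1
    counts0, _ = np.histogram(x[y == 0], bins=edges)
    counts1, _ = np.histogram(x[y == 1], bins=edges)
    return edges, counts0, counts1


def plot_stacked(edges, counts0, counts1):
    """Draw one bar per bin, with the label-1 counts stacked on the label-0 counts."""
    fig, ax = plt.subplots()
    width = np.diff(edges)
    ax.bar(edges[:-1], counts0, width, align="edge", color="r", label="label = 0")
    ax.bar(edges[:-1], counts1, width, align="edge", color="y",
           bottom=counts0, label="label = 1")
    ax.set_xlabel("Variable")
    ax.set_ylabel("Count")
    ax.set_xlim(0, 1)
    ax.legend()
    return fig, ax


# Made-up sample data (assumption), standing in for the question's x and y.
x = np.array([0.05, 0.10, 0.30, 0.40, 0.45, 0.60, 0.80, 0.90, 0.95, 1.00])
y = np.array([0, 1, 0, 0, 1, 1, 0, 1, 1, 1])

edges, counts0, counts1 = binned_label_counts(x, y)
fig, ax = plot_stacked(edges, counts0, counts1)
```

Because the bar positions and both count arrays all come from `edges`, their lengths always agree, which avoids the "incompatible sizes" error from the question. Changing `n_bins` is enough to get more or fewer bars.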

4 Comments

Thanks for the response. What would I put for yerr in this case? I don't quite understand how that works :)
yerr can be omitted. It allows you to put an error range on top of each of the histogram bars. It is optional.
Many thanks for the help. I have updated my question above. I think I am close now but cannot figure out this error message, can you see anything I have done incorrectly here?
See my comment on the question: fix the value of N
