I've been trying to plot a histogram using Python with matplotlib. I've got two datasets, the heights of a sample of men and of women, each stored as a Python list imported from a CSV file.

The code that I'm using:

import csv
import numpy as np
from matplotlib import pyplot as plt
men=[]
women=[]

with open('women.csv','r') as f:
    r1=csv.reader(f, delimiter=',')
    for row in r1:
        women+=[row[0]]

with open('men.csv','r') as f:
    r2=csv.reader(f, delimiter=',')
    for row in r2:
        men+=[row[0]]


fig = plt.figure()
ax = fig.add_subplot(111)

numBins = 20
ax.hist(men,numBins,color='blue',alpha=0.8)
ax.hist(women,numBins,color='red',alpha=0.8)
plt.show()

and the error that I get:

Traceback (most recent call last):
  File "//MEME/Users/Meme/Miniconda3/Lib/idlelib/test.py", line 22, in <module>
    ax.hist(men,numBins,color='blue',alpha=0.8)
  File "\\MEME\Users\Meme\Miniconda3\lib\site-packages\matplotlib\__init__.py", line 1811, in inner
    return func(ax, *args, **kwargs)
  File "\\MEME\Users\Meme\Miniconda3\lib\site-packages\matplotlib\axes\_axes.py", line 5983, in hist
    raise ValueError("color kwarg must have one color per dataset")
ValueError: color kwarg must have one color per dataset
  • Here is a tutorial: plot.ly/matplotlib/histograms Commented Aug 10, 2016 at 12:24
  • I get the same error using that code. I believe the issue has something to do with how I imported the dataset? Commented Aug 10, 2016 at 12:35

1 Answer

NOTE: I assume your files contain multiple comma-separated lines and that the first entry in each line is the height.

The bug is in how you append data to the women and men lists. row[0] is actually a string, because csv.reader returns every field as text. Hence matplotlib receives lists of strings instead of numbers and gets confused. I suggest you run this code before plotting to see what the lists contain:

import csv
import numpy as np
from matplotlib import pyplot as plt
men=[]
women=[]

with open('women.csv','r') as f:
    r1=csv.reader(f, delimiter=',')
    for row in r1:
        women+=[row[0]]  # row[0] is still a string here

with open('men.csv','r') as f:
    r2=csv.reader(f, delimiter=',')
    for row in r2:
        men+=[row[0]]  # row[0] is still a string here


fig = plt.figure()
ax = fig.add_subplot(111)
# Inspect the lists: the quotes in the output show the values are strings
print(men)
print(women)
#numBins = 20
#ax.hist(men,numBins,color='blue',alpha=0.8)
#ax.hist(women,numBins,color='red',alpha=0.8)
#plt.show()

A sample output would be:

['1', '3', '3']
['2', '3', '1']

So in the loops, just convert each string to a float or an integer, e.g. women += [float(row[0])] and men += [float(row[0])].
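
Putting the conversion into the reading step, the whole script could look roughly like the sketch below. The helper names read_heights and plot_heights are made up for illustration and are not part of csv or matplotlib. It assumes, as above, that the height is the first column, and additionally that there is no header row (a header would make float() raise a ValueError).

```python
import csv


def read_heights(path):
    """Return the first column of a CSV file as a list of floats.

    Hypothetical helper; assumes no header row and numeric first fields.
    """
    heights = []
    with open(path, newline='') as f:
        for row in csv.reader(f, delimiter=','):
            if row:  # csv.reader yields [] for blank lines; skip them
                heights.append(float(row[0]))  # string -> number
    return heights


def plot_heights(men, women, num_bins=20):
    """Overlay two histograms, as in the question (hypothetical helper)."""
    # Imported here so read_heights can be used without matplotlib installed.
    from matplotlib import pyplot as plt

    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.hist(men, num_bins, color='blue', alpha=0.8)
    ax.hist(women, num_bins, color='red', alpha=0.8)
    plt.show()


# Usage, assuming men.csv and women.csv are in the working directory:
# plot_heights(read_heights('men.csv'), read_heights('women.csv'))
```

Because the lists now hold floats, each ax.hist call sees a single numeric dataset and the single color passed to it should be accepted.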

2 Comments

Ah yes, it works perfectly now! Thanks for the quick help!
Good to know it helped! Could you please accept my answer by ticking the tick next to it? Thanks heaps. Happy coding
