
Hi all. I'm working with a data set that has two columns: one holds names and the other holds the number of births for each name. I want to import a CSV file, run some basic operations on it (such as finding the baby name with the most births), and then plot the data as a bar graph. However, when the DataFrame has an index, the bar graph uses the index for the x-axis instead of the names. So I removed the index, and now I get all kinds of errors. Below is my code: first the version with the index, then the version without. Thanks in advance; this is really driving me crazy.

import pandas as pd
import matplotlib.pyplot as plt
import pdb
import matplotlib as p
import os
from pandas import DataFrame
Location = os.path.join(os.path.sep,'Users', 'Mark\'s Computer','Desktop','projects','data','births1880.csv')
a = pd.read_csv(Location, index_col = False)
print(a) #print the dataframe just to see what I'm getting.
MaxValue = a['Births'].max()
MaxName = a['Names'][a['Births'] == MaxValue].values
print(MaxValue, ' ', MaxName)
a.plot(kind ='bar')
plt.show()

This code works, but it produces a bar graph with the index on the x-axis instead of the names.

import pandas as pd
import matplotlib.pyplot as plt
import pdb
import matplotlib as p
import os
from pandas import DataFrame
Location = os.path.join(os.path.sep,'Users', 'Mark\'s Computer','Desktop','projects','data','births1880.csv')
a = pd.read_csv(Location, index_col = True) #why is setting the index column to true removing it?
print(a) #print the dataframe just to see what I'm getting.
MaxValue = a['Births'].max()
MaxName = a['Names'][a['Births'] == MaxValue].values
print(MaxValue, ' ', MaxName)
a.plot(kind ='bar', x='Names', y = 'Births' )
plt.show()

Edited to include the solution.

  • index_col is not supposed to be a boolean, but the column(s) you want as the index. I recommend reading the read_csv docs for whatever arguments you use. Also, there is too much going on in this question. Is it really "how to plot a DataFrame with one column as the x-axis?" If so, it's best to provide a simple DataFrame that demonstrates the issue! Commented Sep 5, 2014 at 2:11
  • Part of the problem is that once you set a column as an index, you cannot continue to treat it as a column. Commented Sep 5, 2014 at 2:12
  • Figured it out, and it turned out to be really, really simple. I just had to add x='Names' and y='Births'. Commented Sep 5, 2014 at 14:54
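
For anyone landing here later, the fix from the comments can be sketched roughly as follows. This is only a minimal example under some assumptions: the sample rows are made up (the real births1880.csv was never posted), the CSV is read from an in-memory string rather than a file, and the Agg backend is selected just so the snippet runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data standing in for births1880.csv.
csv_text = "Names,Births\nBob,968\nJessica,155\nMary,77\nJohn,578\nMel,973\n"

# Default read: no column becomes the index, so 'Names' and 'Births'
# both remain ordinary columns that can be selected by label.
df = pd.read_csv(io.StringIO(csv_text))

# Row holding the largest number of births.
top = df.loc[df["Births"].idxmax()]
print(top["Names"], top["Births"])

# Name the x and y columns explicitly so the bars are labelled by name
# instead of by the integer index.
ax = df.plot(kind="bar", x="Names", y="Births")
labels = [tick.get_text() for tick in ax.get_xticklabels()]
print(labels)

# In an interactive session, drop the matplotlib.use("Agg") line and
# call plt.show() here to display the chart.
plt.close("all")
```

Passing x= and y= leaves the DataFrame itself untouched, so lookups such as df['Names'] keep working after the plot call.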

1 Answer


It would have been nice if you'd provided a sample CSV file. I made one up, though it took me a while to figure out what format pandas expects.

I used a test.csv that looked like:

names,births
mike,3
mark,4

Then my Python code:

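import pandas
import numpy
import matplotlib.pyplot as plt

# Read the CSV without using any column as the index.
a = pandas.read_csv('test.csv', index_col=False)
a.plot(kind='bar')
# Current pandas centres bar i on x = i, so the labels go at the integer positions.
indices = numpy.arange(len(a['names']))
plt.xticks(indices, a['names'].values)
plt.show()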
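
An alternative to relabelling the ticks by hand, sketched here under the assumption that the data matches the two-row test file above (built inline as a DataFrame so no file is needed, with the Agg backend chosen only so it runs without a display), is to move the names into the index before plotting. The bar plot then takes its x labels from the index. The trade-off is the one raised in the comments on the question: after set_index, 'names' is no longer a column.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Inline stand-in for the two-row test file.
df = pd.DataFrame({"names": ["mike", "mark"], "births": [3, 4]})

# Move 'names' into the index; only 'births' is left as a column.
indexed = df.set_index("names")
print(list(indexed.columns))

# The bar plot labels each bar with the corresponding index value.
ax = indexed.plot(kind="bar")
labels = [tick.get_text() for tick in ax.get_xticklabels()]
print(labels)

# The names are now reached through the index, not through indexed['names'].
print(indexed["births"].idxmax())

plt.close("all")
```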



2 Comments

Thanks for the answer, and sorry about not including the CSV file. I cut the code that I used to make mine out of the post and forgot about it. I'll remember it for next time! Anyway, for whatever reason your code doesn't work for me; it keeps spitting back errors. I'm guessing this is because I'm using Python 3. On the bright side, I figured it out myself after toying around with the parameters of the plot function: adding x='Names' and y='Births' fixes my problem. Thanks anyway for the effort, I really appreciate it.
Yeah, I'm on Python 2.7. I'm glad you figured it out!
