1

I have two CSV files in a folder that look like this:

(file1)

Count      Bins
0       -0.322392
1       -0.319392
1       -0.316392
0       -0.313392
2       -0.310392
1       -0.307392
5       -0.304392
4       -0.301392

(file 2)

Count      Bins
5       -0.322392
1       -0.319392
1       -0.316392
6       -0.313392
2       -0.310392
1       -0.307392
2       -0.304392
4       -0.301392

I want to make a line graph for each file, with Bins on the x-axis and Count on the y-axis, so there would be only one line in each graph. This is the code I am using so far:

import pandas as pd
import os
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

#path where csv files are stored
pth = (r'F:\Sheyenne\Statistics\IDL_stats\NDII-2')

#initiate loop
for f in os.listdir(pth):
    if not os.path.isfile(os.path.join(pth,f)):
        continue
    #read each file
    df = pd.read_csv(os.path.join(pth, f))
    #add column names
    df.columns=['Count', 'Bins']
    #create pdf file to save graphs to
    with PdfPages(r'F:\Sheyenne\Statistics\IDL_stats\Delete.pdf') as pdf:
         #plot the graph
         df2=df.plot(title=str(f))
         #set x-label
         df2.set_xlabel("Bins")
         #set y-label
         df2.set_ylabel("Count")
         #save the figure
         pdf.savefig(df2)
         #close the figure
         plt.close(df2)
print "Done Processing"  

But this graphs two lines, one for Count and one for Bins. It also graphs only the first file and not the second, and returns this error:

Traceback (most recent call last):

  File "<ipython-input-5-b86bf00675fa>", line 1, in <module>
    runfile('F:/python codes/IDL_histograms.py', wdir='F:/python codes')

  File "C:\Users\spotter\AppData\Local\Continuum\Anaconda\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 685, in runfile
    execfile(filename, namespace)

  File "C:\Users\spotter\AppData\Local\Continuum\Anaconda\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 71, in execfile
    exec(compile(scripttext, filename, 'exec'), glob, loc)

  File "F:/python codes/IDL_histograms.py", line 26, in <module>
    pdf.savefig(df2)

  File "C:\Users\spotter\AppData\Local\Continuum\Anaconda\lib\site-packages\matplotlib\backends\backend_pdf.py", line 2438, in savefig
    raise ValueError("No such figure: " + repr(figure))

ValueError: No such figure: <matplotlib.axes._subplots.AxesSubplot object at 0x0D628FB0>
  • Please post full traceback. Commented Oct 26, 2015 at 20:39
  • I added an edit for you. Commented Oct 26, 2015 at 20:41

2 Answers

7

Pandas DataFrame.plot() returns a matplotlib Axes object, but PdfPages.savefig() needs a Figure object. Get the current matplotlib figure with plt.gcf() and save that.

# Open the pdf before looping to add pages
with PdfPages(r'C:\test\Delete.pdf') as pdf:
    for f in os.listdir(pth):
        if not os.path.isfile(os.path.join(pth,f)):
            continue
        # skip PDF files, such as the output file if it is saved in this folder
        if f.lower().endswith('.pdf'):
            continue
        #read each file
        df = pd.read_csv(os.path.join(pth, f))
        #add column names
        df.columns=['Count', 'Bins']
        #plot the graph
        df2=df.plot(title=str(f))
        #set x-label
        df2.set_xlabel("Bins")
        #set y-label
        df2.set_ylabel("Count")
        #save the figure
        fig = plt.gcf()
        pdf.savefig(fig)
        #close the figure
        plt.close(fig)

Works for me.
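
For anyone who wants to check the Axes-versus-Figure point without touching their own files, here is a minimal sketch. It rests on a few assumptions of mine that are not in the original post: it uses the non-interactive Agg backend, builds the two tables from the question in memory instead of reading CSV files, and writes to a temporary path. It also uses ax.get_figure() rather than plt.gcf(); in this situation both should refer to the same figure.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.backends.backend_pdf import PdfPages

# The two tables from the question, built in memory instead of read from disk
bins = [-0.322392, -0.319392, -0.316392, -0.313392,
        -0.310392, -0.307392, -0.304392, -0.301392]
frames = {
    "file1": pd.DataFrame({"Count": [0, 1, 1, 0, 2, 1, 5, 4], "Bins": bins}),
    "file2": pd.DataFrame({"Count": [5, 1, 1, 6, 2, 1, 2, 4], "Bins": bins}),
}

# Hypothetical output location, used only for this illustration
out_path = os.path.join(tempfile.mkdtemp(), "example.pdf")

# Open the PDF once, before the loop, so every figure becomes a new page
with PdfPages(out_path) as pdf:
    for name, df in frames.items():
        ax = df.plot(x="Bins", y="Count", title=name)  # returns an Axes
        fig = ax.get_figure()                          # the Figure that owns it
        pdf.savefig(fig)                               # savefig accepts a Figure
        plt.close(fig)
    page_count = pdf.get_pagecount()

print(page_count)
```

If the with PdfPages(...) line were moved inside the loop, the file would be reopened and overwritten on each pass, which would leave a single page.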


6 Comments

That saves the figures great. I am not sure why only one graph was being returned before; I changed the file directory and it seems to work fine now...
Actually, this prints both figures just fine, but still only the first figure is saved to the PDF...
I don't understand. If your complaint is that the output file has only one figure, the cause is that you have the "with PdfPages()..." inside the for loop, so it reopens the file on each iteration and saves only one figure. In that case, you need to put the "with PdfPages()" outside the for loop.
Yes, that is what is going wrong for me now: the output file contains only the first figure.
See the new code in the original post. I also suggest following Mad Physicist's comment on properly specifying the x and y axes.
0

Instead of df2=df.plot(title=str(f)), which plots every column in your DataFrame as a separate line against the index, try df2=df.plot(x='Bins', y='Count', title=str(f)).
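
A quick way to see the difference is to count the lines on each Axes. This is only a sketch: it uses the first table from the question built in memory, and the Agg backend is my assumption so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd

# First table from the question
df = pd.DataFrame({
    "Count": [0, 1, 1, 0, 2, 1, 5, 4],
    "Bins": [-0.322392, -0.319392, -0.316392, -0.313392,
             -0.310392, -0.307392, -0.304392, -0.301392],
})

# Default call: every numeric column is drawn against the row index
ax_default = df.plot(title="default")

# Explicit x and y: a single line of Count against Bins
ax_xy = df.plot(x="Bins", y="Count", title="x and y given")

n_default = len(ax_default.get_lines())
n_xy = len(ax_xy.get_lines())
plt.close("all")

print(n_default, n_xy)
```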

1 Comment

That's true, but it doesn't help with his problem. I can't seem to replicate it.
