I have 14 .csv files (one per location) that will be used to make 14 bar plots of daily rainfall. The following code is an example of how one bar plot is produced.

import numpy as np
import pandas as pd 
from datetime import datetime, time, date
import matplotlib.pyplot as plt

# Import data
dat = pd.read_csv('a.csv')
df0 = dat.loc[:, ['TimeStamp', 'RF']]

# Change time format
df0["time"] = pd.to_datetime(df0["TimeStamp"])
df0["day"] = df0['time'].map(lambda x: x.day)
df0["month"] = df0['time'].map(lambda x: x.month)
df0["year"] = df0['time'].map(lambda x: x.year)
df0.to_csv("a2.csv", na_rep="0")  # write to csv

# Combine for daily rainfall
df1 = pd.read_csv('a2.csv', encoding='latin-1',
              usecols=['day', 'month', 'year', 'RF', 'TimeStamp'])
df2 = df1.groupby(['day', 'month', 'year'], as_index=False).sum()
df2.to_csv("a3.csv", na_rep="0", header=None)  # write to csv

# parse date
df3 = pd.read_csv("a3.csv", header=None, index_col='datetime',
             parse_dates={'datetime': [1,2,3]},
             date_parser=lambda x: datetime.strptime(x, '%d %m %Y'))

def dt_parse(date_string):
    dt = datetime.strptime(date_string, '%d %m %Y')
    return dt

# sort datetime
df4 = df3.sort_index()
final = df4.reset_index()

# rename columns
final.columns = ['date', 'bleh', 'rf']

final[['date','rf']].plot()

plt.suptitle('Rain 2015-2016', fontsize=20)
plt.xlabel('Date', fontsize=18)
plt.ylabel('Rain / mm', fontsize=16)
plt.savefig('a.jpg')
plt.show()

And the final plot looks like this: (image not included)

How can I automate this code (i.e. write a for-loop perhaps?) so that I don't have to re-type the code for each .csv file? It would be nice if the code also saves the figure with the name of the .csv as the name of the .jpg file.

The names of the 14 files are as follows: names = ["a.csv","b.csv", "c.csv","d.csv","e.csv","f.csv"...]

Here's an example of the type of file that I'm working with: https://dl.dropboxusercontent.com/u/45095175/test.csv

1 Answer

First method: put all your csv files in the current folder and loop over them with the os module.

import os
for f in os.listdir('.'):                 # loop through all the files in your current folder
    if f.endswith('.csv'):                # find csv files
        fn, fext = os.path.splitext(f)    # split file name and extension

        dat = pd.read_csv(f)              # import data
        # Run the rest of your code here

        plt.savefig('{}.jpg'.format(fn))  # name the figure with the same file name 

Second method: if you don't want to use the os module, you can put your file names in a list like this:

files = ['a.csv', 'b.csv']

for f in files:
    fn = f.split('.')[0]

    dat = pd.read_csv(f)
    # Run the rest of your code here

    plt.savefig('{}.jpg'.format(fn))
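
For reference, here is a minimal sketch of what the whole thing could look like once "the rest of your code" is wrapped in a function. It is not the asker's exact pipeline: the names plot_daily_rainfall and plot_all are made up for illustration, it assumes every file has the TimeStamp and RF columns from the question, it sums per calendar day in memory instead of writing a2.csv and a3.csv, and it draws bars with ax.bar rather than the line produced by .plot(). The Agg backend line is also an assumption so the loop runs without opening windows; drop it if you want plt.show().

```python
import os

import matplotlib
matplotlib.use("Agg")  # assumption: render to files only, no interactive windows
import matplotlib.pyplot as plt
import pandas as pd


def plot_daily_rainfall(csv_path):
    """Sum RF per calendar day and save a bar plot named after the CSV file."""
    folder, filename = os.path.split(csv_path)
    stem, _ = os.path.splitext(filename)          # 'a.csv' -> 'a'
    out_path = os.path.join(folder, stem + ".jpg")

    dat = pd.read_csv(csv_path, usecols=["TimeStamp", "RF"])
    dat["TimeStamp"] = pd.to_datetime(dat["TimeStamp"])

    # Group on the date with the time of day set to midnight.
    # groupby sorts the keys, so the result is already in date order.
    daily = dat.groupby(dat["TimeStamp"].dt.normalize())["RF"].sum()

    fig, ax = plt.subplots()
    ax.bar(daily.index, daily.values)
    fig.suptitle("Rain 2015-2016 ({})".format(stem), fontsize=20)
    ax.set_xlabel("Date", fontsize=18)
    ax.set_ylabel("Rain / mm", fontsize=16)
    fig.autofmt_xdate()                           # tilt the date labels
    fig.savefig(out_path)
    plt.close(fig)                                # free the figure before the next file
    return out_path, daily


def plot_all(folder="."):
    """Run plot_daily_rainfall on every .csv file in folder."""
    outputs = []
    for name in sorted(os.listdir(folder)):
        if name.endswith(".csv"):
            out_path, _ = plot_daily_rainfall(os.path.join(folder, name))
            outputs.append(out_path)
    return outputs
```

Calling plot_all('.') from the folder that holds the 14 files should then write a.jpg, b.jpg and so on next to them. Closing each figure inside the function keeps the plots from piling up in memory or drawing on top of each other.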

7 Comments

I actually already do this. The first line in my code is: os.chdir('/Users/me/desktop')
@JAG2024 Have you tried the rest of the code? Does it work?
Almost! I just get the error message Traceback (most recent call last): File "run.py", line 57, in <module> plt.savefig('{}.jpg'.format(fn)) # name the figure with the same file name NameError: name 'fn' is not defined
I copied and pasted your "First method" code and added my code where you said # Run the rest of your code here.
Is there also a way to include the name of the csv file in the renaming of the new csv files (e.g. change a3 in df2.to_csv("a3.csv", na_rep="0", header=None) to the name of the csv file used as well as in plt.suptitle('Rain 2015-2016', fontsize=20)?
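
On the last comment: one way to carry the csv name through is to build every derived name from the file stem. The sketch below only shows the naming; derived_names and its dictionary keys are hypothetical, and the "2"/"3" suffixes simply mirror a2.csv and a3.csv from the question. As for the NameError, it usually means the savefig line ran somewhere fn had never been assigned, for example outside the if block, so it is worth checking that it is indented to the same level as the pd.read_csv line.

```python
import os


def derived_names(csv_name):
    """Build the intermediate file names, figure name and title from one csv name."""
    stem, _ = os.path.splitext(os.path.basename(csv_name))  # 'a.csv' -> 'a'
    return {
        "step2_csv": "{}2.csv".format(stem),   # replaces the hard-coded 'a2.csv'
        "step3_csv": "{}3.csv".format(stem),   # replaces the hard-coded 'a3.csv'
        "figure": "{}.jpg".format(stem),
        "title": "Rain 2015-2016 ({})".format(stem),
    }


for f in ["a.csv", "b.csv"]:
    names = derived_names(f)
    print(names["step3_csv"], names["figure"], names["title"])
```

Inside the loop you would then write df2.to_csv(names["step3_csv"], na_rep="0", header=None) and plt.suptitle(names["title"], fontsize=20). One caveat: if the intermediate files end in .csv and sit in the same folder, the os.listdir loop will pick them up as inputs on the next run, so it may be safer to write them to a separate folder or give them a different extension.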