I have yearly data files in different folders. Each file contains daily data ranging from Jan 1 to Dec 31. A data file name looks like AS060419.67, where the last four digits represent the year (i.e. 1967) and 0604 is the folder name.

I tried to read these multiple files using the code below, but it reads only the last year's data in the last folder:

def date_parser(doy, year):    
    return dt.datetime.strptime(doy.zfill(3)+year, '%j%Y')

files = glob.glob('????/AS*')
files.sort()
files
STNS = {}
for f in files:
    stn_id, info = f.split('/')
    year = "".join(info[-5:].split('.'))
    #print (f,stn_id)
    with open(f) as fo:                  
        data = fo.readlines()[:-1]
        data = [d.strip() for d in data]
        data = '\n'.join(data)
        with open('data.dump', 'w') as dump:
            dump.write(data)

parser = lambda date: date_parser(date, year=year)
df = pd.read_table('data.dump', delim_whitespace=True,names=['date','prec'], 
                   na_values='DNA', parse_dates=[0], date_parser=parser, index_col='date' ) 

df.replace({'T': 0})
df = df.apply(pd.to_numeric, args=('coerce',))
df.name = stn_name
df.sid = stn_id

if stn_id not in STNS.keys():
    STNS[stn_name] = df

else:
    STNS[stn_id] = STNS[stn_id].append(df)
    STNS[stn_id].name = df.name
    STNS[stn_id].sid = df.sid
    #outfile.write(line)

To make the plot:

for stn in STNS:
    STNS[stn_id].plot()
    plt.title('Precipitation for {0}'.format(STNS[stn].name))

The problem is that it reads only the last year's data in the last folder. Can anyone help me figure out this problem? Your help will be highly appreciated.

  • Sounds like you want os.walk Commented Mar 26, 2016 at 9:19
  • could you help me please :( Commented Mar 26, 2016 at 9:22
  • You are overwriting the output data with open('data.dump', 'w'). You should probably be opening that file in 'a' mode. Take a look at the accepted answer to "python open built-in function: difference between modes a, a+, w, w+, and r+?" for info about file modes. Commented Mar 26, 2016 at 9:24
  • @bikuser, can you post a sample of your input files in the original format (as text)? 5-7 rows would be enough. I have a feeling that you don't need to loop through your files ... Commented Mar 26, 2016 at 9:51
  • @bikuser, I've added an answer - please check it. Commented Mar 26, 2016 at 12:20

2 Answers

You can do it like this:

import os
import glob
import pandas as pd
import matplotlib.pyplot as plt

# file mask
fmask = r'./data/????/AS*.??'

# all RegEx replacements
replacements = {
  r'T': 0
}

# list of data files
flist = glob.glob(fmask)


def read_data(flist, date_col='date', **kwargs):
    dfs = []
    for f in flist:
        # parse year from the file name
        y = os.path.basename(f).replace('.', '')[-4:]
        df = pd.read_table(f, **kwargs)
        # replace day of year with a date
        df[date_col] = pd.to_datetime(y + df[date_col].astype(str).str.zfill(3), format='%Y%j')
        dfs.append(df)
    return pd.concat(dfs, ignore_index=True)


df = read_data(flist,
               date_col='date',
               sep=r'\s+',
               header=None,
               names=['date','prec'],
               engine='python',
               skipfooter=1,
              ) \
     .replace(replacements, regex=True) \
     .set_index('date') \
     .apply(pd.to_numeric, args=('coerce',))


df.plot()

plt.show()

I downloaded only four files, so the plot shows only the data from those files.

[Figure: plot of the data read from the four downloaded files]
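The two parsing steps inside read_data can be checked on their own, without pandas. The following is a minimal sketch using only the standard library; the helper names year_from_filename and doy_to_date are made up for illustration, and the path '0604/AS060419.67' is simply the example file name from the question placed in its folder.

```python
import datetime as dt
import os


def year_from_filename(path):
    """Return the four-digit year encoded at the end of a name like AS060419.67."""
    # 'AS060419.67' -> 'AS06041967' -> '1967'
    return os.path.basename(path).replace('.', '')[-4:]


def doy_to_date(doy, year):
    """Convert a day-of-year value (1..366) and a year string to a date."""
    # Zero-pad the day of year to three digits so '%j' parses it unambiguously.
    return dt.datetime.strptime(year + str(doy).zfill(3), '%Y%j').date()


year = year_from_filename('0604/AS060419.67')
print(year)                    # 1967
print(doy_to_date(1, year))    # 1967-01-01
print(doy_to_date(32, year))   # 1967-02-01
print(doy_to_date(365, year))  # 1967-12-31
```

The year has to be taken from each file separately, because the day-of-year values repeat from one file to the next. That is why the conversion happens inside the loop, once per file.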

2 Comments

Hi @MaxU, Thank you very much for your kind support :)
@bikuser, glad i could help :)

You overwrite the same file over and over again. Derive your target file name from your source file name. Or use the append mode if you want it all in the same file.

How do you append to a file?
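As a rough sketch of the append option: passing 'a' instead of 'w' to open makes each write land at the end of the file instead of replacing its contents. The helper append_lines and the sample rows below are made up for illustration, and a temporary directory stands in for the real working folder.

```python
import os
import tempfile


def append_lines(path, lines):
    """Append lines to path; mode 'a' creates the file if it does not exist."""
    with open(path, 'a') as dump:
        for line in lines:
            dump.write(line + '\n')


# Hypothetical location for the dump file, so that the sketch is self-contained.
dump_path = os.path.join(tempfile.mkdtemp(), 'data.dump')

append_lines(dump_path, ['1 0.0', '2 1.5'])  # rows from a first file (made-up values)
append_lines(dump_path, ['1 T', '2 DNA'])    # rows from a second file (made-up values)

with open(dump_path) as fo:
    contents = fo.read().splitlines()

print(contents)  # ['1 0.0', '2 1.5', '1 T', '2 DNA']
```

Two caveats apply. Running the script again keeps appending to the same file, so the dump should be deleted between runs. Also, once all years share one file, the day-of-year column no longer says which year a row belongs to, so deriving a separate target name per source file may be the safer of the two options.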

1 Comment

Hi Jacques, how do I use append mode with the open function? Could you help me, please?
