Is there an elegant way to read one file at a time, do some preprocessing, and then merge everything into one big DataFrame?
Here is how I do it now; I am sure there must be a way to get rid of the counter variable i.
i = 0
outdf = DataFrame()
for myfile in myfiles:
    tdf = read_csv(myfile)  # Read
    # Do some annotations
    tdf['Class'] = os.path.basename(myfile).split('.')[0]  # filename without extension
    # ..............
    # -----------------
    if i == 0:
        outdf = tdf
    else:
        outdf = concat([outdf, tdf])
    i = i + 1
One simplification: since outdf starts out as an empty DataFrame, the if/else and the counter are unnecessary. Just use outdf = concat([outdf, tdf]) in every iteration; in the first iteration the concatenation with the empty DataFrame simply returns tdf unchanged.
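For comparison, here is a minimal sketch of the loop-free pattern: move the per-file preprocessing into a helper and pass a list comprehension to a single pd.concat call. The helper name annotate and the sample files are my own illustration, not part of the original code; I also assume the Class column should be the filename without its extension.

```python
import os
import tempfile
import pandas as pd

def annotate(path):
    """Read one CSV and tag every row with a Class derived from the filename."""
    tdf = pd.read_csv(path)
    tdf['Class'] = os.path.splitext(os.path.basename(path))[0]
    return tdf

# Build two small sample CSVs so the sketch is self-contained.
tmpdir = tempfile.mkdtemp()
for name in ('red', 'blue'):
    with open(os.path.join(tmpdir, name + '.csv'), 'w') as fh:
        fh.write('x,y\n1,2\n3,4\n')

myfiles = sorted(os.path.join(tmpdir, f) for f in os.listdir(tmpdir))

# One concat over a list comprehension replaces the loop, the counter,
# and the if/else: pd.concat handles any number of frames at once.
outdf = pd.concat([annotate(f) for f in myfiles], ignore_index=True)
```

ignore_index=True renumbers the rows 0..n-1 in the merged frame; drop it if you want to keep each file's original row index instead.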