
Is there an elegant way to read one file at a time, do some preprocessing, and then merge everything into one big dataframe? The way I do it is shown below; I am sure there must be some way to get rid of the variable i here.

import os
from pandas import DataFrame, read_csv, concat

i = 0
outdf = DataFrame()
for myfile in myfiles:
    tdf = read_csv(myfile)  # Read
    # Do some annotations
    tdf['Class'] = os.path.basename(myfile).split('.')[0]  # e.g. filename up to the first '.'
    # ..............
    #-----------------
    if i == 0:
        outdf = tdf
    else:
        outdf = concat([outdf, tdf])
    i = i + 1
  • AFAIK you don't need i or the if clause in that loop. Just use outdf = concat([outdf, tdf]). In the first iteration it will concatenate with the empty dataframe, so it will return the same dataframe (see the short sketch after these comments). Commented May 12, 2016 at 18:28
  • At some point I started doing this kind of funny thing. Thanks a lot. Commented May 12, 2016 at 18:31
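
For reference, a minimal sketch of the loop that comment describes, assuming the same imports as in the question (note that some newer pandas versions may emit a FutureWarning when an empty frame is included in a concat):

outdf = DataFrame()
for myfile in myfiles:
    tdf = read_csv(myfile)
    # ... annotations as in the question ...
    outdf = concat([outdf, tdf])  # first iteration concatenates with the empty frame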

2 Answers


You don't need to concatenate the DataFrames on each iteration, as concat can concatenate multiple DataFrames. Just store each individual DataFrame in a list, and concatenate at the end.

outdf = []
for myfile in myfiles:
    tdf = read_csv(myfile)
    # Do some annotations
    tdf['Class'] = os.path.basename(myfile).split('.')[0]
    # ..............
    #-----------------
    outdf.append(tdf)

outdf = concat(outdf)
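
If the per-file annotation fits into a single expression, the same pattern can be collapsed further; the sketch below uses a hypothetical .assign call standing in for the real annotation step:

outdf = concat(
    read_csv(myfile).assign(Class=os.path.basename(myfile).split('.')[0])
    for myfile in myfiles
)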

1 Comment

This will also be faster: concatenating inside the loop copies the accumulated data on every iteration, while a single concat at the end copies each frame only once.

You can use enumerate to get rid of the manual counter.

outdf = DataFrame()
for i, myfile in enumerate(myfiles):
    tdf = read_csv(myfile)
    tdf['Class'] = os.path.basename(myfile).split('.')[0]
    if i == 0:
        outdf = tdf
    else:
        outdf = concat([outdf, tdf])
