
I am looking into creating one big pandas DataFrame from several individual frames. The data is stored in MF4 files, and the number of source files varies for each cycle. The goal is to automate this process.

Creation of the DataFrames:

df = MDF('File1.mf4').to_dataframe(channels)
df1 = MDF('File2.mf4').to_dataframe(channels)
df2 = MDF('File3.mf4').to_dataframe(channels)

These DataFrames are then merged:

df = pd.concat([df, df1, df2], axis=0)

How can I do this without dynamically creating variables for df, df1 etc.? Or is there no other way?

I have all file paths in a list of the form:

Filepath = ['File1.mf4', 'File2.mf4', 'File3.mf4']

Now I am thinking of looping through it and dynamically creating the DataFrames df, df1, ..., df1000. Any advice here?

Edit: here is the full code:

df = MDF('File1.mf4').to_dataframe(channels)
df1 = MDF('File2.mf4').to_dataframe(channels)
df2 = MDF('File3.mf4').to_dataframe(channels)

# The data has some offset:
x = df.index.max()
df1.index += x
x = df1.index.max()
df2.index += x

# With the index corrected, the data can be merged
df = pd.concat([df, df1, df2], axis=0)
  • Try creating a function which accepts a variable number of arguments, i.e. def funct(*argv). Commented Jun 21, 2021 at 14:12

2 Answers


The way I'm interpreting your question is that you have a predefined list of files you want to load. So just:

l = []
for f in [ list ... of ... files ]:
    df = load_file(f)  # however you load it
    l.append(df)

big_df = pd.concat(l)
del l, df, f  # if you want to clean it up

You therefore don't need to manually specify variable names for your data sub-sections. If you also want to do checks or column renaming between the various files, you can put that into the for-loop (or, if you want to simplify to a list comprehension, into the load_file function body), as sketched below.
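For illustration, a minimal sketch of that list-comprehension variant, where load_file is a hypothetical helper wrapping MDF(...).to_dataframe(channels) plus any per-file cleanup; the file list and channel names are placeholders:

import pandas as pd
from asammdf import MDF

channels = ['ChannelA', 'ChannelB']  # placeholder channel names

def load_file(path):
    # Hypothetical helper: load one MF4 file and apply per-file checks/renames here
    frame = MDF(path).to_dataframe(channels)
    # e.g. frame = frame.rename(columns={'old': 'new'})
    return frame

files = ['File1.mf4', 'File2.mf4', 'File3.mf4']
big_df = pd.concat([load_file(f) for f in files], axis=0)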


3 Comments

Perfect, this has helped me a lot. How important is the last line for cleaning up? Wouldn't garbage collection delete these automatically?
If you're running it standalone, sure. But if you're running it via Spyder or some other IDE where you can examine variables after they're created, they'll persist so you can inspect them. If you really want to release the memory, you have to restart the kernel or explicitly delete the variables.
Perfect, this makes total sense! Thanks again.
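To illustrate the point about explicit deletion from the comment above, a small self-contained sketch (using throwaway frames rather than MF4 data):

import gc
import pandas as pd

# Throwaway frames, just to have something to delete
l = [pd.DataFrame({'a': [i]}) for i in range(3)]
big_df = pd.concat(l)

del l          # drop the name so the list of frames can be freed
gc.collect()   # optional: forces a collection pass (mainly helps with reference cycles)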

Try this:

df_list = [MDF(file).to_dataframe(channels) for file in Filepath]
df = pd.concat(df_list)
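Note that this concatenates the frames as-is; if the index offset from the question still needs to be applied, a plain loop keeps the running offset explicit. A minimal, untested sketch mirroring the question's offset logic (the channel names are placeholders):

import pandas as pd
from asammdf import MDF

Filepath = ['File1.mf4', 'File2.mf4', 'File3.mf4']
channels = ['ChannelA', 'ChannelB']  # placeholder channel names

frames = []
offset = 0
for path in Filepath:
    frame = MDF(path).to_dataframe(channels)
    frame.index += offset          # shift this file behind the previous one
    offset = frame.index.max()     # the next file starts where this one ends
    frames.append(frame)

df = pd.concat(frames, axis=0)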

