I have a function (function_from_xml_pddataframe) that takes an XML file from the data folder and transforms it into a pandas DataFrame called df_xml.

After this, I need to build a single pandas DataFrame (all_dfs) by concatenating all of them row-wise.

This is what I have so far, using a for loop:

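import os

import pandas as pd
from tqdm import tqdm

all_dfs = pd.DataFrame()

# os.listdir yields the file names; tqdm("/data") would iterate over the characters of the string
for file in tqdm(os.listdir("/data")):

    if file.endswith(".xml"):
        # Pass the full path of the current file to the conversion function
        df_xml = function_from_xml_pddataframe(os.path.join("/data", file))

        df_created = df_xml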
  • Create a list of DataFrames by appending df_xml to it inside the for loop, then call pd.concat on that list after the loop. Commented Mar 29, 2022 at 14:27
  • @ScottBoston, thanks for your time. This is exactly the idea. How do I append df_xml to the list inside the loop? Commented Mar 29, 2022 at 14:32
  • Well, it looks like @TitouanL has posted this solution. Commented Mar 29, 2022 at 14:33

1 Answer

Assuming all your DataFrames have the same columns, you can collect them in a list inside the for loop and concatenate them once after the loop.

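import os

import pandas as pd
from tqdm import tqdm

list_of_dataframes = []

# os.listdir returns bare file names, so the folder is joined back on below
for file in tqdm(os.listdir("/data")):

    if file.endswith(".xml"):
        df_xml = function_from_xml_pddataframe(os.path.join("/data", file))
        list_of_dataframes.append(df_xml)

# Concatenate once, row-wise, after the loop
all_dfs = pd.concat(list_of_dataframes)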
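
If you want to try the pattern without your own files, here is a minimal, self-contained sketch. It is only an illustration under assumptions: xml_to_dataframe is a hypothetical stand-in for your function_from_xml_pddataframe, the <rows><row>...</row></rows> layout is made up, and the two sample files are written to a temporary folder instead of /data. It also passes ignore_index=True, which is optional and renumbers the rows of the combined DataFrame.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

import pandas as pd


def xml_to_dataframe(xml_path):
    """Hypothetical stand-in for function_from_xml_pddataframe.

    Assumes a made-up layout: <rows><row><col>value</col>...</row>...</rows>.
    """
    root = ET.parse(xml_path).getroot()
    # One dict per <row>, mapping each child tag to its text
    records = [{child.tag: child.text for child in row} for row in root]
    return pd.DataFrame(records)


def concat_xml_folder(folder):
    """Parse every .xml file in folder and stack the results row-wise."""
    frames = [xml_to_dataframe(path) for path in sorted(Path(folder).glob("*.xml"))]
    if not frames:
        # pd.concat raises ValueError on an empty list, so return an empty frame
        return pd.DataFrame()
    # ignore_index=True renumbers rows 0..n-1 instead of repeating each file's index
    return pd.concat(frames, ignore_index=True)


# Demo: two tiny XML files plus one non-XML file that should be skipped
with tempfile.TemporaryDirectory() as tmp:
    for name, value in [("a.xml", "1"), ("b.xml", "2")]:
        Path(tmp, name).write_text(
            f"<rows><row><id>{value}</id><name>{name}</name></row></rows>"
        )
    Path(tmp, "notes.txt").write_text("ignored")
    all_dfs = concat_xml_folder(tmp)

print(all_dfs)
```

Building the list first and calling pd.concat once avoids copying the growing DataFrame on every iteration, which is what happens if you concatenate inside the loop.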