1

I have a set of python dictionaries that I have obtained by means of a for loop. I am trying to have these added to Pandas Dataframe.

Output for a variable called output

{'name':'Kevin','age':21}
{'name':'Steve','age':31}
{'name':'Mark','age':11}

I am trying to append each of these dictionary into a single Dataframe. I tried to perform the below but it just added the first row.

df = pd.DataFrame(output)

Could anyone advice as to where am going wrong and have all the dictionaries added to the Dataframe.

Update on the loop statement

The below code helps to read xml and convert it to a dataframe. Right now I see I am able to loop in through multiple xml files and created dictionaries for each xml file. I am trying to see how could I add each of these dictionaries to a single Dataframe:

def f(elem, result):
    result[elem.tag] = elem.text
    cs = elem.getchildren()
    for c in cs:
        result = f(c, result)
    return result

    result = {}
    for file in allFiles:
        tree = ET.parse(file)
        root = tree.getroot()
        result = f(root, result)
        print(result)
4
  • what is the type of output? print(type(output)) Commented Mar 6, 2019 at 11:02
  • @anky_91, Dataframe that stores these values from the dictionaries Commented Mar 6, 2019 at 11:03
  • Possible duplicate of Append dictionary to pandas dataframe in a loop Commented Mar 6, 2019 at 11:04
  • @anky_91 its <class 'dict'> Commented Mar 6, 2019 at 11:04

2 Answers 2

3

You can append each dictionary to list and last call DataFrame constructor:

out = []
for file in allFiles:
    tree = ET.parse(file)
    root = tree.getroot()
    result = f(root, result)
    out.append(result)
        
df = pd.DataFrame(out)
Sign up to request clarification or add additional context in comments.

7 Comments

thanks for the help. I tried the first approach you mentioned above and see that I am able to only get the header column returned back to the Dataframe df
I have problem such that in future I would be running this across different files that would have different header columns so I am not able to define what the header columns would be each time.
@codinglearner - So in lopp is not possible create list of dictionaries?
have added the loop statement, I am trying to overcome. However I am unable to add it to a Dataframe. Hope this helps.
@codinglearner - it is better, also is possible add some data sample working with your code?
|
1

We can add these dicts to a list:

ds = []
for ...:      # your loop
    ds += [d] # where d is one of the dicts

When we have the list of dicts, we can simply use pd.DataFrame on that list:

ds = [
    {'name':'Kevin','age':21},
    {'name':'Steve','age':31},
    {'name':'Mark','age':11}
]
pd.DataFrame(ds)

Output:

    name  age
0  Kevin   21
1  Steve   31
2   Mark   11

Update: And it's not a problem if different dicts have different keys, e.g.:

ds = [
    {'name':'Kevin','age':21},
    {'name':'Steve','age':31,'location': 'NY'},
    {'name':'Mark','age':11,'favorite_food': 'pizza'}
]
pd.DataFrame(ds)

Output:

   age favorite_food location   name
0   21           NaN      NaN  Kevin
1   31           NaN       NY  Steve
2   11         pizza      NaN   Mark

Update 2: Building up on our previous discussion in Python - Converting xml to csv using Python pandas we can do:

results = []
for file in glob.glob('*.xml'):
    tree = ET.parse(file)
    root = tree.getroot()
    result = f(root, {})
    result['filename'] = file # added filename to our results
    results += [result]

pd.DataFrame(results)

5 Comments

thanks for the help. I am continuing from the other ticket where you helped me build the xml to csv convertor. Since it was a new issue I raised a new ticket on this. I have multiple dictionaries from each of the xml file but now trying to add them to Dataframe
I tried to add all dicts to a list by doing the below ds = [] for i in result: ds += [i] print(ds) The above converts the dicts to a list but not dicts inside a list as shown below. Sample output: ['name':'Kevin','age':21] ['name':'Steve','age':31] ['name':'Mark','age':11]
OK, so let me clarify it a bit, on each step we get a dict like {'name': 'Kevin', ...}, right?
OK, great, can you check the code from Update 2 please?
Added filename to the results with result['filename'] = file # added filename to our results

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.