For loop parse XML with Python

Question

XML file

<?xml version="1.0"?>
 <productListing title="Python Products">
  <product id="1">
   <name>Python Hoodie</name>
   <description>This is a Hoodie</description>
   <cost>$49.99</cost>
   <shipping>$2.00</shipping>
  </product>
  <product id="2">
   <name>Python shirt</name>
   <description>This is a shirt</description>
   <cost>$79.99</cost>
   <shipping>$4.00</shipping>
  </product> 
  <product id="3">
   <name>Python cap</name>
   <description>This is a cap</description>
   <cost>$99.99</cost>
   <shipping>$3.00</shipping>
  </product> 
</productListing>

import xml.etree.ElementTree as et
import pandas as pd
import numpy as np

First I import the libraries.

tree = et.parse("documents/pythonstore.xml")

The file is stored in the documents folder.

root = tree.getroot()
for a in range(3):
  for b in range(4):
     new=root[a][b].text
     print (new)

This prints the text of every child element in the XML.

df=pd.DataFrame(columns=['name','description','cost','shipping'])

I created an empty DataFrame to hold the child elements of the XML.

My questions:

How can I collect the values of new into a list? I tried append and the list function, but neither worked.
How do I use a for loop to load the child elements into the DataFrame?

Could somebody please help me? Thank you so much!

Rakesh · Accepted Answer · 2018-02-28 09:00:09Z


This might help.

# -*- coding: utf-8 -*-
s = """<?xml version="1.0"?>
 <productListing title="Python Products">
  <product id="1">
   <name>Python Hoodie</name>
   <description>This is a Hoodie</description>
   <cost>$49.99</cost>
   <shipping>$2.00</shipping>
  </product>
  <product id="2">
   <name>Python shirt</name>
   <description>This is a shirt</description>
   <cost>$79.99</cost>
   <shipping>$4.00</shipping>
  </product> 
  <product id="3">
   <name>Python cap</name>
   <description>This is a cap</description>
   <cost>$99.99</cost>
   <shipping>$3.00</shipping>
  </product> 
</productListing>"""

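import xml.etree.ElementTree as et
import pandas as pd

# fromstring() returns the root element directly, so getroot() is not needed
root = et.fromstring(s)

res = []
for a in range(3):        # three <product> elements
    r = []
    for b in range(4):    # four child elements per product
        r.append(root[a][b].text)
    res.append(r)

print(res)
df = pd.DataFrame(res, columns=['name','description','cost','shipping'])
print(df)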

Output:

[['Python Hoodie', 'This is a Hoodie', '$49.99', '$2.00'], ['Python shirt', 'This is a shirt', '$79.99', '$4.00'], ['Python cap', 'This is a cap', '$99.99', '$3.00']]

            name       description    cost shipping
0  Python Hoodie  This is a Hoodie  $49.99    $2.00
1   Python shirt   This is a shirt  $79.99    $4.00
2     Python cap     This is a cap  $99.99    $3.00
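
The loop above relies on every product having exactly four children in a fixed order. As a hedged alternative, the sketch below looks each value up by tag name instead of by position. The helper name product_rows, the COLUMNS list, and the shortened two-product XML sample are assumptions made for illustration; they are not part of the original answer.

```python
import xml.etree.ElementTree as et

# Shortened copy of the XML from the question (two products instead of three).
XML = """<?xml version="1.0"?>
<productListing title="Python Products">
  <product id="1">
    <name>Python Hoodie</name>
    <description>This is a Hoodie</description>
    <cost>$49.99</cost>
    <shipping>$2.00</shipping>
  </product>
  <product id="2">
    <name>Python shirt</name>
    <description>This is a shirt</description>
    <cost>$79.99</cost>
    <shipping>$4.00</shipping>
  </product>
</productListing>"""

# Child tags to extract from each <product> (illustrative constant).
COLUMNS = ["name", "description", "cost", "shipping"]


def product_rows(xml_text):
    """Return one dict per <product>, keyed by child tag name."""
    root = et.fromstring(xml_text)
    rows = []
    for product in root.findall("product"):
        # findtext() returns None when a child tag is missing,
        # whereas root[a][b] would raise IndexError.
        row = {col: product.findtext(col) for col in COLUMNS}
        row["id"] = product.get("id")  # the id attribute on <product>
        rows.append(row)
    return rows


rows = product_rows(XML)
print(rows)
```

Passing rows to pd.DataFrame(rows) should then give one column per dictionary key, so the column names come from the tags themselves rather than from a hand-written list.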

2 Comments

Liu Yu Over a year ago

Thank you tons! This problem has been bugging me for hours.

Liu Yu Over a year ago

This is a better way, sorry about the range function.

res = []
for child in root:
    r = []
    for element in child:
        new = element.text
        r.append(new)
    res.append(r)
print(res)
df = pd.DataFrame(res, columns=['name','description','cost','shipping'])
print(df)
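
The nested loop in this comment can also be written as a nested list comprehension. This is only a sketch of the same idea on a shortened two-product sample; the XML constant is an assumption made so the snippet runs on its own.

```python
import xml.etree.ElementTree as et

# Shortened sample in the same shape as the XML from the question.
XML = """<productListing title="Python Products">
  <product id="1">
    <name>Python Hoodie</name>
    <description>This is a Hoodie</description>
    <cost>$49.99</cost>
    <shipping>$2.00</shipping>
  </product>
  <product id="2">
    <name>Python shirt</name>
    <description>This is a shirt</description>
    <cost>$79.99</cost>
    <shipping>$4.00</shipping>
  </product>
</productListing>"""

root = et.fromstring(XML)

# Outer loop: each <product>; inner loop: each child element of that product.
res = [[element.text for element in child] for child in root]
print(res)
```

As with the loop version, res is a list of rows that can be handed to pd.DataFrame together with the column names.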
