I'm relatively new to Python. I've been trying to learn Python through a hands-on approach (I learnt C/C++ by doing Project Euler). Right now I'm learning how to extract data from files. I've gotten the hang of extracting data from simple text files, but I'm kind of stuck on XML files. Here is an example of what I'm trying to do: I have my call logs backed up on Google Drive, and there are a lot of them (about 4000). Here is an example entry from the XML file:

<call number="+91234567890" duration="49" date="1483514046018" type="3" presentation="1" readable_date="04-Jan-2017 12:44:06 PM" contact_name="Dad" />

I want to take all the calls to my dad and display them like this:

number = 234567890
duration = "49"  date="04-Jan-2017 12:44:06 PM"
duration = "x"   date="y"
duration = "n"   date="z"

and so on like that. How do you propose I do that?

  • "How do you propose I do that?" I propose that you try writing some code, so we can see that you've put some effort into this. Commented Jan 18, 2018 at 3:44
  • Well, you can probably start by showing us an MCVE of what you've tried so far. Commented Jan 18, 2018 at 3:44

1 Answer

It's advisable to provide sufficient information in a question so that the problem can be reproduced. For this answer, I used the following as test.xml:

<?xml version="1.0" encoding="UTF-8"?>
<call number="+91234567890" duration="49" date="1483514046018" type="3" 
 presentation="1" readable_date="04-Jan-2017 12:44:06 PM" 
    contact_name="Dad" />

First we need to figure out which elements we can iterate over. Since <call .../> is the root element here, we iterate over that.

NOTE: if your file has tags/elements before the line you provided, the root element will be something other than call, and you will need to find out what it is.

>>> [i for i in root.iter('call')]
[<Element 'call' at 0x29d3410>]

Here you can see that we can iterate over the call element.
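As a small sketch of why this works for both layouts: as far as I know, Element.iter() visits the element it is called on as well as everything below it, so root.iter('call') finds the entries whether <call> is the root itself or sits inside a wrapper. The <calls> wrapper name and the second entry below are assumptions made up for illustration; your backup file may use a different name.

```python
import xml.etree.ElementTree as ET

# Layout 1: a lone <call/> element that is itself the document root.
single = ET.fromstring('<call duration="49" contact_name="Dad" />')

# Layout 2 (assumed): several <call/> entries inside a wrapper element.
# The name "calls" is hypothetical; use whatever your file really contains.
wrapped = ET.fromstring(
    '<calls>'
    '<call duration="49" contact_name="Dad" />'
    '<call duration="12" contact_name="Dad" />'
    '</calls>'
)

# iter('call') includes the starting element when its tag matches,
# plus every matching descendant.
single_hits = list(single.iter('call'))
wrapped_hits = list(wrapped.iter('call'))

print(single.tag, len(single_hits))
print(wrapped.tag, len(wrapped_hits))
```

Printing root.tag on your real file is a quick way to see which layout you have.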

Then we simply iterate over the elements and pull out the attribute keys and values we need.

Working Code

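import xml.etree.ElementTree as ET

data_file = 'test.xml'
tree = ET.parse(data_file)  # parse the XML file into an element tree
root = tree.getroot()       # top-level element of the document

# Print the attributes we are interested in for every <call> element
for call in root.iter('call'):
    print('duration', '=', call.attrib['duration'])
    print('date', '=', call.attrib['date'])

Result

>>> 
duration = 49
date = 1483514046018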
>>> 
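To get closer to the output asked for in the question, one possible sketch is to keep only the entries whose contact_name matches and to print readable_date rather than the raw date value. The calls_for helper, the <calls> wrapper, and the second and third sample entries are invented for illustration; only the first entry comes from the question. With a real file you would use ET.parse(...).getroot() instead of ET.fromstring.

```python
import xml.etree.ElementTree as ET

# Sample data: the first <call> is from the question; the <calls> wrapper
# and the other two entries are made up so the filter has something to skip.
SAMPLE = """<calls>
  <call number="+91234567890" duration="49" date="1483514046018" type="3" presentation="1" readable_date="04-Jan-2017 12:44:06 PM" contact_name="Dad" />
  <call number="+91234567890" duration="120" readable_date="05-Jan-2017 09:15:30 AM" contact_name="Dad" />
  <call number="+910000000000" duration="7" readable_date="05-Jan-2017 10:00:00 AM" contact_name="Someone Else" />
</calls>"""


def calls_for(root, contact_name):
    """Return (number, duration, readable_date) for each matching call."""
    return [
        (call.get('number'), call.get('duration'), call.get('readable_date'))
        for call in root.iter('call')
        if call.get('contact_name') == contact_name
    ]


root = ET.fromstring(SAMPLE)
rows = calls_for(root, 'Dad')

if rows:
    # Print the number once, taken from the first matching entry
    print('number =', rows[0][0])
for _, duration, readable_date in rows:
    print('duration = "{}"  date="{}"'.format(duration, readable_date))
```

Element.get() returns None for a missing attribute instead of raising KeyError as attrib[...] would, which is a bit more forgiving if some entries lack a contact_name. The number is printed as stored, country code included.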

3 Comments

"A XML file needs at least one root element for parsing": I wondered about that. Can't the <call .../> element be the root?
@John Gordon, you are right. I tried it without a <data> element as the root and it works; the <call> element acts as the root in this case. The documentation I referred to needed a root element to iterate over. I've updated the answer. Thanks for catching that.
@Anil_M Thank you so much for this. Now I've understood how to extract data from xml files. If I had just read this doc link I would have got it. Thanks a bunch.
