I am trying to parse an XML file using ElementTree and extract the required fields.

Problem: my list comes out empty. The condition I am trying to apply is: if a reference's 'type' is 'cve', I want the 'id' attribute of that reference tag.

Can someone suggest a fix or correct my approach to get the required field?

My actual code is below:

import xml.etree.ElementTree as ET

file_name = "updateinfo.xml"
parser = ET.XMLParser(encoding="utf-8")
tree = ET.parse(file_name, parser=parser)
tree_toString = (ET.tostring(tree.getroot()))
for ele in tree.findall('update'):
    cveList = [
        ele.find('references/reference').get('id') if ele.find('references/reference').get('type') == 'cve' else None
        for cve in ele.find('references/reference')]
    print cveList

My XML structure is below:

<?xml version="1.0" encoding="UTF-8"?>
<updates>
        <update status="final" from="[email protected]" version="4" type="enhancement" >
            <id>RHEA-2017:2259</id>
            <issued date="2017-08-01 05:59:34 UTC" />
            <title>new packages: usbguard</title>
            <release>0</release>
            <rights>Copyright 2017 Red Hat Inc</rights>
            <pushcount>4</pushcount>
            <updated date="2017-08-01 05:59:34 UTC" />
            <references>
                <reference href="https://access.redhat.com/errata/RHEA-2017:2259" type="self" id="RHEA-2017:2259" title="RHEA-2017:2259" />
                <reference href="https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/7.4_Release_Notes/index.html" type="other" id="ref_0" title="other_reference_0" />
            </references>
            <pkglist>
                <collection short="" >
                    <name>rhel-7-server-rpms__7_DOT_4__x86_64</name>
                    <package src="usbguard-0.7.0-3.el7.src.rpm" name="usbguard" epoch="0" version="0.7.0" release="3.el7" arch="i686" >
                        <filename>usbguard-0.7.0-3.el7.i686.rpm</filename>
                        <sum type="sha256" >efd5ca6dd3df02e8537cf45cef48508bf023f568a98ce9f28e9baf77c5caac6c</sum>
                    </package>
                    <package src="usbguard-0.7.0-3.el7.src.rpm" name="usbguard" epoch="0" version="0.7.0" release="3.el7" arch="x86_64" >
                        <filename>usbguard-0.7.0-3.el7.x86_64.rpm</filename>
                        <sum type="sha256" >3f72768880085d6bfff37636d3a8eb54184e5619353b5efbefd5738e74bdfa08</sum>
                    </package>
                </collection>
            </pkglist>
        </update>
        <update status="final" from="[email protected]" version="1" type="bugfix" >
            <id>RHBA-2014:0722</id>
            <issued date="2014-06-10 00:00:00" />
            <title>kexec-tools bug fix update</title>
            <rights>Copyright 2014 Red Hat Inc</rights>
            <pushcount>1</pushcount>
            <updated date="2014-06-10 00:00:00" />
            <references>
                <reference href="https://rhn.redhat.com/errata/RHBA-2014-0722.html" type="self" title="RHBA-2014:0722" />
            </references>
            <pkglist>
                <collection short="" >
                    <name>rhel-7-server-rpms__7_DOT_4__x86_64</name>
                    <package src="kexec-tools-2.0.4-32.el7_0.1.src.rpm" name="kexec-tools" epoch="0" version="2.0.4" release="32.el7_0.1" arch="x86_64" >
                        <filename>kexec-tools-2.0.4-32.el7_0.1.x86_64.rpm</filename>
                        <sum type="sha256" >8e214681104e4ba73726e0ce11d21b963ec0390fd70458d439ddc72372082034</sum>
                    </package>
                </collection>
            </pkglist>
        </update>
        <update status="final" from="[email protected]" version="4" type="security" >
            <id>RHSA-2017:2831</id>
            <issued date="2017-09-28 18:56:55 UTC" />
            <title>Critical: firefox security update</title>
            <release>0</release>
            <rights>Copyright 2017 Red Hat Inc</rights>
            <severity>Critical</severity>
            <pushcount>4</pushcount>
            <updated date="2017-09-28 18:56:56 UTC" />
            <references>
                <reference href="https://access.redhat.com/errata/RHSA-2017:2831" type="self" id="RHSA-2017:2831" title="RHSA-2017:2831" />
                <reference href="https://bugzilla.redhat.com/show_bug.cgi?id=1496649" type="bugzilla" id="1496649" title="CVE-2017-7793 Mozilla: Use-after-free with Fetch API (MFSA 2017-22)" />
                <reference href="https://bugzilla.redhat.com/show_bug.cgi?id=1496651" type="bugzilla" id="1496651" title="CVE-2017-7810 Mozilla: Memory safety bugs fixed in Firefox 56 and Firefox ESR 52.4 (MFSA 2017-22)" />
                <reference href="https://bugzilla.redhat.com/show_bug.cgi?id=1496652" type="bugzilla" id="1496652" title="CVE-2017-7814 Mozilla: Blob and data URLs bypass phishing and malware protection warnings (MFSA 2017-22)" />
                <reference href="https://bugzilla.redhat.com/show_bug.cgi?id=1496653" type="bugzilla" id="1496653" title="CVE-2017-7818 Mozilla: Use-after-free during ARIA array manipulation (MFSA 2017-22)" />
                <reference href="https://bugzilla.redhat.com/show_bug.cgi?id=1496654" type="bugzilla" id="1496654" title="CVE-2017-7819 Mozilla: Use-after-free while resizing images in design mode (MFSA 2017-22)" />
                <reference href="https://bugzilla.redhat.com/show_bug.cgi?id=1496655" type="bugzilla" id="1496655" title="CVE-2017-7823 Mozilla: CSP sandbox directive did not create a unique origin (MFSA 2017-22)" />
                <reference href="https://bugzilla.redhat.com/show_bug.cgi?id=1496656" type="bugzilla" id="1496656" title="CVE-2017-7824 Mozilla: Buffer overflow when drawing and validating elements with ANGLE (MFSA 2017-22)" />
                <reference href="https://www.redhat.com/security/data/cve/CVE-2017-7793.html" type="cve" id="CVE-2017-7793" title="CVE-2017-7793" />
                <reference href="https://www.redhat.com/security/data/cve/CVE-2017-7810.html" type="cve" id="CVE-2017-7810" title="CVE-2017-7810" />
                <reference href="https://www.redhat.com/security/data/cve/CVE-2017-7814.html" type="cve" id="CVE-2017-7814" title="CVE-2017-7814" />
                <reference href="https://www.redhat.com/security/data/cve/CVE-2017-7818.html" type="cve" id="CVE-2017-7818" title="CVE-2017-7818" />
                <reference href="https://www.redhat.com/security/data/cve/CVE-2017-7819.html" type="cve" id="CVE-2017-7819" title="CVE-2017-7819" />
                <reference href="https://www.redhat.com/security/data/cve/CVE-2017-7823.html" type="cve" id="CVE-2017-7823" title="CVE-2017-7823" />
                <reference href="https://www.redhat.com/security/data/cve/CVE-2017-7824.html" type="cve" id="CVE-2017-7824" title="CVE-2017-7824" />
                <reference href="https://access.redhat.com/security/updates/classification/#critical" type="other" id="classification" title="critical" />
                <reference href="https://www.mozilla.org/en-US/security/advisories/mfsa2017-22/" type="other" id="ref_0" title="other_reference_0" />
            </references>
            <pkglist>
                <collection short="" >
                    <name>rhel-7-server-rpms__7_DOT_4__x86_64</name>
                    <package src="firefox-52.4.0-1.el7_4.src.rpm" name="firefox" epoch="0" version="52.4.0" release="1.el7_4" arch="x86_64" >
                        <filename>firefox-52.4.0-1.el7_4.x86_64.rpm</filename>
                        <sum type="sha256" >7b81b37bf969534bee0152bc13db56ae410eee06120a78d8da261c10c73c0514</sum>
                    </package>
                </collection>
            </pkglist>
        </update>
        <update status="final" from="[email protected]" version="2" type="bugfix" >
            <id>RHBA-2016:2423</id>
            <issued date="2016-11-03 06:09:21 UTC" />
            <title>oscap-anaconda-addon bug fix update</title>
            <release>0</release>
            <rights>Copyright 2016 Red Hat Inc</rights>
            <severity>None</severity>
            <pushcount>2</pushcount>
            <updated date="2016-11-03 06:10:44 UTC" />
            <references>
                <reference href="https://access.redhat.com/errata/RHBA-2016:2423" type="self" id="RHBA-2016:2423" title="RHBA-2016:2423" />
                <reference href="https://bugzilla.redhat.com/show_bug.cgi?id=1269211" type="bugzilla" id="1269211" title="could move security section down to bottom since it's not as important as network spoke" />
            </references>
            <pkglist>
                <collection short="" >
                    <name>rhel-7-server-rpms__7_DOT_4__x86_64</name>
                    <package src="oscap-anaconda-addon-0.7-12.el7.src.rpm" name="oscap-anaconda-addon" epoch="0" version="0.7" release="12.el7" arch="noarch" >
                        <filename>oscap-anaconda-addon-0.7-12.el7.noarch.rpm</filename>
                        <sum type="sha256" >507fbf46ddaed0bb4087d3ef2b31db235473f3be36aaa9ed7df43279ed7e2f07</sum>
                    </package>
                </collection>
            </pkglist>
        </update>
</updates>

3 Answers

Question: How do I check a condition during XML parsing?


Strictly speaking, what you are doing is not parsing; this line has already done the parsing:

tree = ET.parse(file_name, parser=parser)

You don't need to pass parser=parser, because ET.parse uses the standard XMLParser by default.
For reference, read: xml.etree.ElementTree.parse

Your example code walks the ElementTree FOUR TIMES: once for .findall(...) and three times for .find(...).

for ele in tree.findall('update'):
    cveList = [
        ele.find('references/reference').get('id') if ele.find('references/reference').get('type') == 'cve' else None
        for cve in ele.find('references/reference')]

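Every .find(...) call searches until it finds the requested element or reaches the end of the tree.
You should avoid such nested code! Note also that .find(...) returns only the first matching element, and iterating over an element yields its children. A reference element has no children, so your list stays empty.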
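To make the difference between .find(...) and .findall(...) concrete, here is a minimal sketch. The SAMPLE string is a trimmed-down, hypothetical stand-in for updateinfo.xml, not the real file:

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened sample in the same shape as updateinfo.xml.
SAMPLE = """<updates>
  <update>
    <references>
      <reference type="self" id="RHSA-2017:2831" />
      <reference type="cve" id="CVE-2017-7793" />
    </references>
  </update>
</updates>"""

root = ET.fromstring(SAMPLE)
update = root.find('update')

# find() returns only the first match, here the type="self" reference.
first = update.find('references/reference')
print(first.get('type'))

# Iterating an Element walks its children; <reference> has none.
children = list(first)
print(children)

# findall() returns every matching element.
all_refs = update.findall('references/reference')
print([ref.get('id') for ref in all_refs])
```

So a comprehension written as "for cve in ele.find(...)" never runs its body, whatever the condition says.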

You can get all reference elements with one loop, for example:

import xml.etree.ElementTree as ET

file_name = "test/updateinfo.xml"
tree = ET.parse(file_name)

cveList = []
for reference in tree.findall('update/references/reference'):
    if reference.attrib.get('type') == 'cve':
        cveList.append(reference.attrib.get('id'))

print(cveList)

Output:

['CVE-2017-7793', 'CVE-2017-7810', 'CVE-2017-7814', 'CVE-2017-7818', 'CVE-2017-7819', 'CVE-2017-7823', 'CVE-2017-7824']
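As a possible alternative, ElementTree's limited XPath support includes attribute predicates, so the type check can move into the path itself. A sketch, again using a hypothetical shortened sample instead of the real file:

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened sample in the same shape as updateinfo.xml.
SAMPLE = """<updates>
  <update>
    <references>
      <reference type="self" id="RHSA-2017:2831" />
      <reference type="cve" id="CVE-2017-7793" />
      <reference type="cve" id="CVE-2017-7810" />
      <reference type="other" id="ref_0" />
    </references>
  </update>
</updates>"""

root = ET.fromstring(SAMPLE)

# [@type='cve'] keeps only <reference> elements whose type attribute is 'cve'.
cve_ids = [ref.get('id')
           for ref in root.findall("update/references/reference[@type='cve']")]
print(cve_ids)
```

Whether this reads better than the explicit if is a matter of taste; the result should be the same list.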

Comment: cveList for each update item instead of getting all items in one list. I would like to iterate over each update and get other attributes as well.

# Start from an empty list
cveList = []
# Find all 'update' elements in the tree
for update in tree.findall('update'):
    # Find all 'references/reference' elements in this update
    for reference in update.findall('references/reference'):
        if reference.attrib.get('type') == 'cve':
            # Find the element with tag <title> in this update
            title = update.find('title').text
            # Append a dict with keys 'title' and 'id'
            cveList.append({'title': title, 'id': reference.get('id')})

Output:

[{'id': 'CVE-2017-7793', 'title': 'Critical: firefox security update'}, {'id': 'CVE-2017-7810', 'title': 'Critical: firefox security update'}, {'id': 'CVE-2017-7814', 'title': 'Critical: firefox security update'}, {'id': 'CVE-2017-7818', 'title': 'Critical: firefox security update'}, {'id': 'CVE-2017-7819', 'title': 'Critical: firefox security update'}, {'id': 'CVE-2017-7823', 'title': 'Critical: firefox security update'}, {'id': 'CVE-2017-7824', 'title': 'Critical: firefox security update'}]

Tested with Python 2.7.9
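If the goal is one record per update holding id, title and that update's own CVE list, as the comment asks, something along these lines might work. The function name collect_updates, the key names and the sample data are assumptions for illustration only:

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened sample in the same shape as updateinfo.xml.
SAMPLE = """<updates>
  <update>
    <id>RHBA-2014:0722</id>
    <title>kexec-tools bug fix update</title>
    <references>
      <reference type="self" title="RHBA-2014:0722" />
    </references>
  </update>
  <update>
    <id>RHSA-2017:2831</id>
    <title>Critical: firefox security update</title>
    <references>
      <reference type="self" id="RHSA-2017:2831" />
      <reference type="cve" id="CVE-2017-7793" />
      <reference type="cve" id="CVE-2017-7810" />
    </references>
  </update>
</updates>"""


def collect_updates(root):
    """Return one dict per <update> with its id, title and CVE ids."""
    records = []
    for update in root.findall('update'):
        # Collect only this update's references of type 'cve'.
        cves = [ref.get('id')
                for ref in update.findall('references/reference')
                if ref.get('type') == 'cve']
        records.append({
            'id': update.findtext('id'),        # text of the <id> child
            'title': update.findtext('title'),  # text of the <title> child
            'cves': cves,
        })
    return records


records = collect_updates(ET.fromstring(SAMPLE))
for record in records:
    print(record)
```

Updates without any CVE reference end up with an empty list here; filter them out afterwards if you only care about security updates.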

3 Comments

Thank you @stovfl
Can you please help me build a cveList for each update item instead of getting all items in one list? I would like to iterate over each update and get other attributes as well. My final list will contain id, title and cvelist; that's the reason I am using for ele in tree.findall('update').
Awesome, thanks!

<?xml version="1.0" encoding="UTF-8"?>
<computer>
<extension_attributes>
    <extension_attribute>
        <id>8</id>
        <name>user1</name>
        <type>String</type>
        <multi_value>false</multi_value>
        <value>Installed</value>
    </extension_attribute>
    <extension_attribute>
        <id>33</id>
        <name>user2</name>
        <type>String</type>
        <multi_value>false</multi_value>
        <value>Not Installed</value>
    </extension_attribute>
</extension_attributes>
</computer>

import requests
import xml.etree.ElementTree as ET

get_url = "https://some.url.com/extension_attributes"
headers = {'Accept': 'application/xml',
           'Content-Type': 'application/xml',
           'authorization': 'Basic xxxxx'}
r = requests.get(get_url, headers=headers)

# Parse the response body and look up the attribute whose <id> is 33
root = ET.fromstring(r.text)
values = root.findall('extension_attributes/extension_attribute')
for val in values:
    if val.find('id').text == '33':
        print('Value', val.find('value').text)
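The same lookup can be tried without a network call, and the id check can be pushed into the path with a child-text predicate. This is only a sketch: SAMPLE is a hypothetical inline copy of the response body shown above.

```python
import xml.etree.ElementTree as ET

# Hypothetical inline copy of the XML response, so no HTTP request is needed.
SAMPLE = """<computer>
<extension_attributes>
    <extension_attribute>
        <id>8</id>
        <name>user1</name>
        <value>Installed</value>
    </extension_attribute>
    <extension_attribute>
        <id>33</id>
        <name>user2</name>
        <value>Not Installed</value>
    </extension_attribute>
</extension_attributes>
</computer>"""

root = ET.fromstring(SAMPLE)

# [id='33'] selects elements that have an <id> child whose text is '33'.
match = root.find("extension_attributes/extension_attribute[id='33']")

# find() returns None when nothing matches, so guard before reading <value>.
value = match.findtext('value') if match is not None else None
print('Value', value)
```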


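Using ele.find(...).get('id') isn't right, because it always looks at the first reference. Loop over ele.findall('references/reference') and use cve.get('id') on each item (id is an attribute, not a child element). Likewise, instead of ele.find(...).get('type'), use cve.get('type').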
