1

I have a XML file with a structure similar to this

<records>
    <record something="this" name="ABC"/>
    <record something="this" name="DEF"/>
    <record name="ABC"  something="this"/>
    <record name="GHI"  something="this"/>
    <record something="this" name="ABC/>

What I am looking for is a Python script to bring back all unique attribute values with the attribute name of name i.e.

ABC
DEF
GHI

The script runs fine when I put the filename in myself but when it is passed through as a parameter, it falls over.

from xml.dom import minidom
import sys
print sys.argv[1]
xmldoc = minidom.parse('/root/%s.xml' % sys.argv[1])
itemlist = list(xmldoc.getElementsByTagName('record'))
itemlist.sort()
for s in itemlist :
    if s.hasAttribute("name"):
        print s.attributes['name'].value

However it is still not bring back unique values

4
  • Do you have to use minidom? Commented Apr 15, 2016 at 20:36
  • No. As long as it works. Commented Apr 15, 2016 at 20:37
  • So basically you want a set of names? Commented Apr 15, 2016 at 20:44
  • All sorted - will give answer in a moment Commented Apr 15, 2016 at 20:44

1 Answer 1

2

A simple way is to use a set and then sort:

Using lxml:

x = """<records>
<record something="this" name="ABC"/>
<record something="this" name="DEF"/>
<record name="ABC"  something="this"/>
<record name="ABC"  something="this"/>
<record name="GHI"  something="this"/>
<record noname="ijk"  something="this"/>
<record noname="lmn"  something="this"/>
<record noname="xyz"  something="this"/>
</records>"""

from lxml.etree import  fromstring
tree = fromstring(x)

print(sorted({n.get("name") for n in tree.findall(".//record[@name]")}))

Using xml:

from xml.etree import ElementTree as et

tree = et.fromstring(x)

print(sorted({n.get("name") for n in tree.findall(".//record[@name]")}))

Both give you:

['ABC', 'DEF', 'GHI']

Use parse with your own code:

from xml.etree import ElementTree as et
import sys
print sys.argv[1]
xmldoc = et.parse('/root/%s.xml' % sys.argv[1])

print(sorted({n.get("name") for n in xmldoc.findall(".//record[@name]")}))
Sign up to request clarification or add additional context in comments.

1 Comment

ImportError: cannot import name parse

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.