In my xml file, I have nodes like this:

<waitingJobs idList="J03ac2db8 J03ac2fb0"/>

I know how to use .findall to search for an exact attribute value, but here it looks like I would need something like a regular expression, because I can't just use

root.findall(".//*[@attrib='value']")

I'd need something like a wildcard match (pseudo-syntax):

root.findall(".//*[@attrib='*value*']")

QUESTION

  1. Is this possible with xml.etree?
  2. How do you do this with lxml?

1 Answer

Unfortunately, the built-in xml.etree.ElementTree library supports only a limited subset of XPath, and functions like contains() and starts-with() are not part of it. You can check the attribute manually instead: find all waitingJobs elements and use .attrib to read the idList value:

import xml.etree.ElementTree as ET

data = """<jobs>
    <waitingJobs idList="J03ac2db8 J03ac2fb0"/>
</jobs>
"""

root = ET.fromstring(data)
value = 'J03ac2db8'
# ElementTree only filters on the presence of idList; the substring test runs in Python
print([elm for elm in root.findall(".//waitingJobs[@idList]")
       if value in elm.attrib["idList"]])
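
One caveat with a plain substring test: it also matches when the value is only part of a longer ID. If idList is meant to be a whitespace-separated list of IDs (an assumption based on the sample above), a minimal sketch that compares whole tokens could look like this. The helper name find_by_id_token and the second waitingJobs element are hypothetical, added only for illustration:

```python
import xml.etree.ElementTree as ET


def find_by_id_token(root, tag, attr, token):
    """Return elements named `tag` whose whitespace-separated `attr` lists `token` exactly."""
    return [
        elm for elm in root.iter(tag)
        # split() breaks the attribute into individual IDs, so partial IDs do not match
        if token in elm.get(attr, "").split()
    ]


data = """<jobs>
    <waitingJobs idList="J03ac2db8 J03ac2fb0"/>
    <waitingJobs idList="J03ac2db80"/>
</jobs>
"""

root = ET.fromstring(data)
matches = find_by_id_token(root, "waitingJobs", "idList", "J03ac2db8")
print([elm.get("idList") for elm in matches])
```

Here only the first element is returned; a substring check would also have returned the second one, because "J03ac2db8" is a prefix of "J03ac2db80".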

With lxml.etree, you can use the xpath() method together with the XPath contains() function:

import lxml.etree as ET

data = """<jobs>
    <waitingJobs idList="J03ac2db8 J03ac2fb0"/>
</jobs>
"""

root = ET.fromstring(data)

value = 'J03ac2db8'
print(root.xpath(".//waitingJobs[contains(@idList, '%s')]" % value))
2 Comments

sigh. I guess it's finally time to move to lxml. For how sucky xml.etree is, why is it still included in Python? Why isn't lxml the default????
@Adrian well, this is definitely sad, but the beauty of Python is also the huge variety of third-party packages available on PyPI. A little about why the XPath support is limited: stackoverflow.com/questions/10982557/…. Thanks.