I am trying to analyze XML data and ran into an issue with HTML entities when I use the following code:
import xml.etree.ElementTree as ET

tree = ET.parse(my_xml_file)
root = tree.getroot()
for regex_rule in root.findall('.//regex_rule'):
    # .get() returns the attribute with &lt; turned into <, but I want &lt; as written
    print(regex_rule.get('input'))
    # Prints False because ElementTree's get method turns &lt; into <, is that right?
    print(regex_rule.get('input') == r"(?&lt;!\S)hello(?!\S)")
And here are the contents of the XML file:
<rules>
    <regex_rule input="(?&lt;!\S)hello(?!\S)" output="world"/>
</rules>
I would appreciate it if anybody could point me to a way of getting the string exactly as written in the XML attribute for input, without &lt; being converted into <.
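In case it helps, here is a minimal self-contained sketch of what I am seeing. It inlines the XML as a string (the name xml_text is just for this example) and parses it with ET.fromstring instead of reading my file. The last line re-escapes the decoded value with xml.sax.saxutils.escape, which does reproduce the &lt; form for this particular attribute. I am assuming that is only a workaround, since escape also rewrites & and >, so it would not necessarily give back the exact source text in general.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Same document as in the question, inlined so the snippet runs without a file.
xml_text = r'''<rules>
    <regex_rule input="(?&lt;!\S)hello(?!\S)" output="world"/>
</rules>'''

root = ET.fromstring(xml_text)
rule = root.find('.//regex_rule')
value = rule.get('input')

# The value comes back with the entity already decoded.
print(value)
print(value == r"(?<!\S)hello(?!\S)")

# Workaround sketch: re-escape the decoded value.
# Note that escape() also converts & and >, so this is not a true round trip.
print(escape(value) == r"(?&lt;!\S)hello(?!\S)")
```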