
I need to parse both var & group root elements.

Code

import xml.etree.ElementTree as ET
tree_ownCloud = ET.parse('0020-syslog_rules.xml')
root = tree_ownCloud.getroot()

Error

xml.etree.ElementTree.ParseError: junk after document element: line 17, column 0

Sample XML

<var name="BAD_WORDS">core_dumped|failure|error|attack| bad |illegal |denied|refused|unauthorized|fatal|failed|Segmentation Fault|Corrupted</var>

<group name="syslog,errors,">
  <rule id="1001" level="2">
    <match>^Couldn't open /etc/securetty</match>
    <description>File missing. Root access unrestricted.</description>
    <group>pci_dss_10.2.4,gpg13_4.1,</group>
  </rule>

  <rule id="1002" level="2">
    <match>$BAD_WORDS</match>
    <options>alert_by_email</options>
    <description>Unknown problem somewhere in the system.</description>
    <group>gpg13_4.3,</group>
  </rule>
</group>

I tried following a couple of other questions here on Stack Overflow, but none of them helped.

I know why the file is not being parsed, and the workarounds people usually suggest are hacks. IMO it's a very common use case to have multiple root elements in XML, so there must be something in the ET parsing library to handle this.

  • Well. "IMO it's a very common use case to have multiple root elements in XML," - this is not true. By definition, an XML document always has exactly one root element. Commented Dec 15, 2017 at 8:23
  • OK, I didn't know that, thanks. Commented Dec 15, 2017 at 8:23

2 Answers

As mentioned in the comment, an XML file cannot have multiple roots. Simple as that.

If you do receive or store data in this format (which is not proper XML), you could consider the hack of wrapping what you have in a fake root tag, e.g.

import xml.etree.ElementTree as ET

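# Read the whole file first; it contains several top-level elements.
with open("0020-syslog_rules.xml", "r") as input_file:
    file_content = input_file.read()

# Wrap the content in a single synthetic root so it parses as one document.
root = ET.fromstring("<fake>" + file_content + "</fake>")
print(root)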
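
To get at the `var` and `group` elements the question asks about, the children of the fake root can be iterated directly. The following is only a sketch: `parse_fragment` and `SAMPLE` are hypothetical names, the sample is an abbreviated copy of the XML from the question, and it assumes the file has no XML declaration or DOCTYPE at the top (either one would be invalid once wrapped inside another element).

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the sample from the question: two top-level elements.
SAMPLE = """<var name="BAD_WORDS">core_dumped|failure|error</var>

<group name="syslog,errors,">
  <rule id="1001" level="2">
    <match>^Couldn't open /etc/securetty</match>
  </rule>
  <rule id="1002" level="2">
    <match>$BAD_WORDS</match>
  </rule>
</group>
"""


def parse_fragment(text, wrapper_tag="fake"):
    """Wrap a multi-root fragment in one synthetic root and return its children."""
    root = ET.fromstring(f"<{wrapper_tag}>{text}</{wrapper_tag}>")
    return list(root)


top_level = parse_fragment(SAMPLE)

# Collect <var> definitions by name, and the ids of every <rule> in each <group>.
variables = {el.get("name"): el.text for el in top_level if el.tag == "var"}
rule_ids = [
    rule.get("id")
    for el in top_level
    if el.tag == "group"
    for rule in el.iter("rule")
]

print([el.tag for el in top_level])
print(variables)
print(rule_ids)
```

The wrapper tag name is arbitrary because only its children are used afterwards.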

Actually, the example data is not a well-formed XML document, but it is a well-formed XML entity. Some XML parsers have an option to accept an entity rather than a document, and in XPath 3.1 you can parse this using the parse-xml-fragment() function.

Another way to parse a fragment like this is to create a wrapper document which references it as an external entity:

<!DOCTYPE wrapper [
<!ENTITY e SYSTEM "fragment.xml">
]>
<wrapper>&e;</wrapper>

and then supply this wrapper document as the input to your XML parser.
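
One caveat for the Python code in the question: per the Python documentation, `xml.etree.ElementTree` does not expand external entities and raises a `ParseError` when it meets one, so this wrapper needs a parser that resolves external entities. A minimal check of that behaviour, where the `WRAPPER` constant simply repeats the document above:

```python
import xml.etree.ElementTree as ET

# The wrapper document from above, referencing fragment.xml as an external entity.
WRAPPER = """<!DOCTYPE wrapper [
<!ENTITY e SYSTEM "fragment.xml">
]>
<wrapper>&e;</wrapper>"""

try:
    ET.fromstring(WRAPPER)
    expanded = True
except ET.ParseError as exc:
    # ElementTree refuses the unresolved &e; reference instead of loading the file.
    expanded = False
    print("ElementTree did not expand the entity:", exc)
```

With ElementTree, the fake-root approach is therefore the simpler route. The entity wrapper fits toolchains whose parser loads external entities.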
