Parse specific xml comment in Python3

Question

I have a xml file containing one interesting comment, and I would like to parse it.

Here I discovered how I can handle comments, but I don't know how to use them from my main app.

#!/usr/bin/python3

import xml.etree.ElementTree as ET

with open('xml_with_comments.xml', 'w') as f:
    f.write('''<?xml version="1.0" encoding="UTF-8"?>
    <root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
        <blah>node 1</blah>
        <!-- secret_content: Hello! -->
        <blah>node 2</blah>
        <!-- A standard comment -->
        <blah>node 3</blah>
    </root>
    ''')

class TreeBuilderWithComments(ET.TreeBuilder):
    def comment(self, data):
        if data.startswith(' secret_content: '):
            self.start(ET.Comment, {})
            self.data(data)
            self.end(ET.Comment)
            print('Secret content from TreeBuilderWithComments: ' + data[17:-1])

root = ET.parse('xml_with_comments.xml', parser=ET.XMLParser(target=TreeBuilderWithComments())).getroot()
for blah in root.findall('blah'):
    print(blah.text)

This outputs:

Secret content from TreeBuilderWithComments: Hello!
node 1
node 2
node 3

Now I would like to do something like print(root.get_secret_content()), which should print the first comment of the file begining by ' secret_content: '.

Tadhg McDonald-Jensen · Accepted Answer · 2016-08-03 16:48:25Z

1

You make an instance of TreeBuilderWithComments() inside the call to ET.parse, if you keep a reference to it you can use that instance to get the secret content:

 # do this first.
comment_handler = TreeBuilderWithComments()

root = ET.parse('xml_with_comments.xml',
                parser=ET.XMLParser(target=comment_handler)
               ).getroot()                # ^^ used here!

for blah in root.findall('blah'):
    print(blah.text)

Then you can implement .get_secret_content to your TreeBuilderWithComments class and use it on the comment_handler instance.

answered Aug 3, 2016 at 16:48

Tadhg McDonald-Jensen

21.6k5 gold badges40 silver badges69 bronze badges

Sign up to request clarification or add additional context in comments.

1 Comment

roipoussiere Over a year ago

Thank you! I also modified TreeBuilderWithComments.comment():

def comment(self, data): [\n] SECRET_CONTENT_KEY = 'secret_content: ' [\n] data = data.strip() [\n] if data.startswith(SECRET_CONTENT_KEY): [\n] self.secret_content = data[len(SECRET_CONTENT_KEY):]

. So get_secret_content() is a simple getter.

Collectives™ on Stack Overflow

Parse specific xml comment in Python3

1 Answer 1

1 Comment

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

1 Comment

Your Answer

Sign up or log in

Post as a guest

Related