I am using lxml to parse a sample HTML document, like this:

import lxml.html

__dom = lxml.html.fromstring("<html><body><div id='mydiv'></div></body></html>")

I am trying to look up, by id, an element that I added to the HTML programmatically, like this:

mydiv = __dom.get_element_by_id('mydiv')
mydiv.text = "<p id='myInner'>this is the inner inner text</p>"
myInner= __dom.get_element_by_id("myInner")

The P is added, but when I try to retrieve it with get_element_by_id I get a KeyError for myInner.

I am guessing that because I added the P as text, it is not parsed as an HTML element, and therefore I cannot look it up.

So my question is really: how do I add to or modify the innerHTML of an element using lxml?
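A quick way to check that guess is to serialize the tree after the assignment. The sketch below reuses the markup from the snippet above; the variable names `dom` and `serialized` are only illustrative. It shows the string being stored as character data, so the angle brackets are escaped and the div ends up with no child elements.

```python
import lxml.html

dom = lxml.html.fromstring("<html><body><div id='mydiv'></div></body></html>")
mydiv = dom.get_element_by_id("mydiv")

# Assigning to .text stores plain character data; nothing is parsed.
mydiv.text = "<p id='myInner'>this is the inner inner text</p>"

# The serializer escapes the markup because it is text, not a node.
serialized = lxml.html.tostring(dom, encoding="unicode")
print(serialized)

# The div still has zero child elements.
print(len(mydiv))

# Passing a default avoids the KeyError and makes the miss explicit.
print(dom.get_element_by_id("myInner", None))
```

So the P is "added" only in the sense that its source text sits inside the div; there is no element with that id in the tree.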

Thanks

1 Answer

As you said, you are assigning a string to the text attribute of the div, so it is stored as plain text rather than parsed. I assume what you're trying to do is add a new P element as a child of the div. You can parse your string into an lxml element first, then insert that element into the existing tree:

import lxml.html

__dom = lxml.html.fromstring("<html><body><div id='mydiv'></div></body></html>")

mydiv = __dom.get_element_by_id('mydiv')
myhtml = lxml.html.fromstring("<p id='myInner'>this is the inner inner text</p>")
mydiv.insert(0, myhtml)
print(lxml.html.tostring(__dom))

OUTPUT

<html><body><div id="mydiv"><p id="myInner">this is the inner inner text</p></div></body></html>
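
Note that insert only adds to whatever the div already contains. If the aim is to replace everything inside the div, in the spirit of assigning innerHTML in a browser, one possible approach is sketched below. `set_inner_html` is a hypothetical helper name, not part of lxml, and the "old" starting content is made up for the demonstration. It relies on `lxml.html.fragments_fromstring`, which returns a list whose first item may be a plain string when the markup starts with text.

```python
import lxml.html


def set_inner_html(element, markup):
    """Hypothetical helper: replace the contents of `element` with `markup`."""
    # Remove existing children one by one; each child's tail text goes with it.
    for child in list(element):
        element.remove(child)

    # Parse the new markup into a list of nodes (possibly led by a string).
    fragments = lxml.html.fragments_fromstring(markup)

    # Reset the leading text, then restore it if the new markup starts with text.
    element.text = None
    if fragments and isinstance(fragments[0], str):
        element.text = fragments.pop(0)

    # Attach the parsed elements as the new children.
    element.extend(fragments)


dom = lxml.html.fromstring(
    "<html><body><div id='mydiv'>old <b>stuff</b> here</div></body></html>"
)
mydiv = dom.get_element_by_id("mydiv")
set_inner_html(mydiv, "<p id='myInner'>this is the inner inner text</p>")

print(lxml.html.tostring(dom, encoding="unicode"))
print(dom.get_element_by_id("myInner").text)
```

The helper removes children explicitly instead of calling `element.clear()`, because `clear()` would also drop the div's own attributes, including its id.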
2 Comments

Thanks, but how can I REPLACE the ENTIRE innerHTML with something else?
Maybe you need to show your HTML before and how you expect it to look after, as I am not sure what you're trying to replace.
