
I'm trying to populate a large number of templated HTML documents with HTML strings stored in JSON. For example, my HTML might look like:

<div class="replace_this_div">
<div>
<p>this text</p>
<p>should be replaced</p>
</div>
</div>

The replacement is in string form and would look something like:

"<p>My replacement code might have standard paragraphs, <a href="fake_link">links</a>, or other html elements such as lists.</p>"

Afterwards, it should simply look like this:

<div class="replace_this_div">
<p>My replacement code might have standard paragraphs, <a href="fake_link">links</a>, or other html elements such as lists.</p>
</div>

I've experimented with BeautifulSoup to accomplish this. The problem is that, although I only want to replace everything inside the designated div, I can't figure out how to do so with a replacement string that is already formatted as HTML (especially given how BeautifulSoup handles tags).

Does anybody have any insight on how to do this? Thanks!

  • What have you tried with BeautifulSoup thus far? Sounds like you're going in the right direction but missing a step somewhere... if you can show that... someone might be able to fill in the blanks. Commented Jan 12, 2019 at 13:46

1 Answer


You can use clear() to remove the existing contents of the tag. Then parse your replacement string into its own BeautifulSoup object by calling the constructor, and insert that object into the original tag with append().

from bs4 import BeautifulSoup

html = """
<div class="replace_this_div">
<div>
<p>this text</p>
<p>should be replaced</p>
</div>
</div>
"""
new_content = '<p>My replacement code might have standard paragraphs, <a href="fake_link">links</a>, or other html elements such as lists.</p>'

soup = BeautifulSoup(html, 'html.parser')
# Locate the div whose contents should be replaced.
outer_div = soup.find('div', attrs={"class": "replace_this_div"})
# Remove everything currently inside the div.
outer_div.clear()
# Parse the replacement string so it is inserted as tags rather than escaped text.
outer_div.append(BeautifulSoup(new_content, 'html.parser'))
print(soup.prettify())

Output

<div class="replace_this_div">
<p>
 My replacement code might have standard paragraphs,
 <a href="fake_link">
  links
 </a>
 , or other html elements such as lists.
</p>
</div>
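
Since the question mentions many templates and replacement strings coming from JSON, the same clear() and append() steps could be wrapped in a small helper. This is only a sketch: the function name replace_div_contents, the sample JSON payload, and the "page1" key are hypothetical and not taken from the question.

```python
import json
from bs4 import BeautifulSoup


def replace_div_contents(html, css_class, new_html):
    """Return html with the contents of the first matching div replaced by new_html."""
    soup = BeautifulSoup(html, "html.parser")
    target = soup.find("div", class_=css_class)
    if target is None:
        # No matching div: return the template unchanged.
        return html
    # Remove whatever the div currently holds (works with or without an inner div).
    target.clear()
    # Parse the replacement so it is inserted as tags rather than escaped text.
    target.append(BeautifulSoup(new_html, "html.parser"))
    return str(soup)


# Hypothetical JSON payload mapping a document name to its replacement HTML.
replacements = json.loads('{"page1": "<p>New <a href=\\"fake_link\\">link</a></p>"}')
template = '<div class="replace_this_div"><div><p>old</p></div></div>'

pages = {
    name: replace_div_contents(template, "replace_this_div", snippet)
    for name, snippet in replacements.items()
}
print(pages["page1"])
```

Returning the template unchanged when the div is missing is one possible choice; raising an error may be preferable if every template is expected to contain the div.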

1 Comment

Thanks for this! I'm still new to actually writing with bs4 rather than just extracting and the solution I'd worked out with replace_with was very hacky and didn't work at all if there was no inner div.
