I have multiple strings that I want to wrap in HTML tags within an HTML document. I want to leave the rest of the text unchanged, but replace each string with an HTML element containing that string.
Furthermore, some of the strings I want to replace contain other strings I want to replace. In these cases, I want to apply the substitution for the larger string and ignore the one for the smaller string.
In addition, I only want to perform a substitution when the string is contained fully within a single element.
Here's my replacement list.
replacement_list = [
    ('foo', '<span title="foo" class="customclass34">foo</span>'),
    ('foo bar', '<span id="id21" class="customclass79">foo bar</span>')
]
Given the following html:
<html>
<body>
<p>Paragraph contains foo</p>
<p>Paragraph contains foo bar</p>
</body>
</html>
I would want the result to be this:
<html>
<body>
<p>Paragraph contains <span title="foo" class="customclass34">foo</span></p>
<p>Paragraph contains <span id="id21" class="customclass79">foo bar</span></p>
</body>
</html>
So far I've tried using the Beautiful Soup library, looping through my replacement list in order of decreasing string length. I can find and replace my strings with other strings, but I can't work out how to insert HTML at those points, or whether there's a better way entirely. Trying to perform the string substitution with a soup.new_tag object fails whether I convert it to a string or not.
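To make the "larger string wins" rule concrete, here is a minimal sketch of the matching I have in mind, applied to the text of a single element as a plain string. It uses only the standard `re` module, not Beautiful Soup, and the helper name `wrap_text` is made up for illustration. It assumes plain substring matching (so 'foo' would also match inside 'food') and does not deal with the part I'm stuck on, which is getting the resulting markup into the parsed document as real elements.

```python
import re

replacement_list = [
    ('foo', '<span title="foo" class="customclass34">foo</span>'),
    ('foo bar', '<span id="id21" class="customclass79">foo bar</span>'),
]

def wrap_text(text, replacements):
    """Hypothetical helper: apply the replacements to one text string."""
    lookup = dict(replacements)
    # Longest strings first, so 'foo bar' is tried before 'foo'
    # at any position where both could match.
    ordered = sorted(lookup, key=len, reverse=True)
    pattern = re.compile('|'.join(re.escape(s) for s in ordered))
    # A single pass over the text: markup inserted for one match
    # is never scanned again for further matches.
    return pattern.sub(lambda m: lookup[m.group(0)], text)

print(wrap_text('Paragraph contains foo', replacement_list))
print(wrap_text('Paragraph contains foo bar', replacement_list))
```

Doing everything in one pass is what stops the 'foo' inside an already-wrapped 'foo bar' from being wrapped a second time, which is the problem I'd expect from a loop of separate replace calls.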
EDIT: Realised the example I gave didn't even conform to my own rules, so I've modified the example.