I want to write a function that highlights some text. It takes an HTML string as input and returns an HTML string with additional HTML tags wrapped around the highlighted text.

Example: Input string (need to highlight the word "text"):

<div>
<a href="..." title="text to highlight">Some text to highlight</a>
<a href="..." title="text to highlight">Some other text to highlight</a>
</div>

Output string:

<div>
<a href="..." title="text to highlight">Some <b class="highlight">text</b> to highlight</a>
<a href="..." title="text to highlight">Some other <b class="highlight">text</b> to highlight</a>
</div>

I have found a regexp that matches only the text between HTML tags, but I can't figure out how to surround part of each match with additional tags:

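import re

highlight_str = u'text'
# Matches a run of characters containing no '<' or '>' that is followed by '<'
p = re.compile(r"[^<>]+(?=[<])")
# search_str is the input HTML string
for match in p.finditer(search_str):
    pass  # code for replacement here ???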

Are there any other ways to do this?

  • stackoverflow.com/questions/1732348/… Commented Nov 1, 2010 at 13:52
  • Seriously, the parade of programmers using regular expressions on HTML is endless. Commented Nov 1, 2010 at 14:31
  • That regex doesn't work for anything other than a rigged demo. Commented Nov 1, 2010 at 14:31
  • OK, understood. It wasn't a good idea to do it with regex. Commented Nov 1, 2010 at 20:33

1 Answer

Look at Beautiful Soup.
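
The idea is to let an HTML parser separate tags from text, and then change only the text nodes, so attribute values such as title="text to highlight" are never touched. Below is a minimal sketch of that idea. Note that it does not use Beautiful Soup itself: it uses the standard library's html.parser so that it runs without installing anything, and the same approach should carry over to Beautiful Soup's text nodes. The names Highlighter and highlight are made up for this example, and the sketch assumes the word should be matched case-sensitively.

```python
import re
from html.parser import HTMLParser


class Highlighter(HTMLParser):
    """Re-emit the HTML as parsed, wrapping matches found in text nodes only."""

    def __init__(self, word):
        # convert_charrefs=False reports entities such as &amp; as separate
        # events, so they can be written back unchanged.
        super().__init__(convert_charrefs=False)
        self.pattern = re.compile(re.escape(word))
        self.out = []

    def handle_starttag(self, tag, attrs):
        # get_starttag_text() returns the start tag exactly as written,
        # attributes included, so attribute values are never modified.
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        # Only text between tags reaches this method.
        self.out.append(
            self.pattern.sub(r'<b class="highlight">\g<0></b>', data))

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)


def highlight(html, word):
    """Return html with every occurrence of word in text nodes wrapped in <b>."""
    parser = Highlighter(word)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    sample = '<a href="..." title="text to highlight">Some text to highlight</a>'
    print(highlight(sample, "text"))
```

Limitations of this sketch: the contents of script and style elements are also reported as text and would be highlighted too, and a word that is split across nested tags will not be matched.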

1 Comment

Could you give a little more info on how to do it with Beautiful Soup?
