0

I want to seach within a string for heading tags,I'm looking for a regular expression to find the index within the document wherever a heading tag occurs, So something like:

str.index('<h*>')

Where the * would represent only 1 character ie. 1 , 2 , 3 etc. eliminating any head tags or html tags

Any help would be much appreciated.

1
  • You could use <h\d> or <h\d[^>]+> if you want to match <hn...> (eg: it has other attributes. Commented Aug 31, 2011 at 14:50

2 Answers 2

1
import re

matches = re.finditer('<h[1-6]>', your_text)
for match in matches:
    print match.start()
Sign up to request clarification or add additional context in comments.

Comments

0

The regex you need is this:

<h.>

This will match <h1>, <h2>, <hr>, etc... If you only want to match heading tags, use:

<h\d>

1 Comment

Horizontal rules are not headings.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.