3

I want to verify that the HTML tags present in a source string are also present in a target string.

For example:

>> source = "<em>Hello</em><label>What's your name</label>"
>> verify_target('<em>Hi</em><label>My name is Jim</label>')
True
>> verify_target('<label>My name is Jim</label><em>Hi</em>')
True
>> verify_target('<em>Hi<label>My name is Jim</label></em>')
False
1
  • match monkey and stars and deco_hand_frog using ratchet Commented Apr 20, 2010 at 6:56

2 Answers

4

I would drop the regex and look at Beautiful Soup instead.
find_all(True) (findAll in Beautiful Soup 3) lists all the tags found in your source.

from bs4 import BeautifulSoup
# `source` is the string from the question
soup = BeautifulSoup(source, "html.parser")
all_tags = soup.find_all(True)
print([tag.name for tag in all_tags])
# ['em', 'label']

Then you just need to remove any duplicates and compare the two tag lists.

This snippet verifies that ALL of source's tags are present in target's tags.

from bs4 import BeautifulSoup

def get_tags_set(source):
    # Collect the distinct tag names found in the markup.
    soup = BeautifulSoup(source, "html.parser")
    return {tag.name for tag in soup.find_all(True)}

def verify(tags_source_orig, tags_source_to_verify):
    # True when every tag name in the original also occurs in the text to verify.
    return tags_source_orig.issubset(tags_source_to_verify)

source = "<label>What's your name</label><label>What's your name</label><em>Hello</em>"
source_to_verify = "<em>Hello</em><label>What's your name</label><label>What's your name</label>"
print(verify(get_tags_set(source), get_tags_set(source_to_verify)))

1 Comment

Yepp. You definitely want to use BeautifulSoup.
1

I don't think regex is the right tool here: HTML is not just a flat string, it has structure, with nested tags.

I suggest using HTMLParser: create a class which parses the original source and builds a structure from it. Then verify that the targets to be checked produce the same structure.
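One possible reading of that idea, as a rough sketch rather than a finished solution: record the nesting path of every opening tag (for example `('em', 'label')` for a label inside an em) and require each path from the source to occur at least as often in the target. A plain set of tag names could not tell the nested third example in the question apart from the first two. The class name `TagPathCollector`, the helper `tag_paths`, and the two-argument form of `verify_target` are assumptions made for illustration; the sketch also assumes every tag is closed, so unclosed void elements such as `<br>` would skew the paths.

```python
from collections import Counter
from html.parser import HTMLParser


class TagPathCollector(HTMLParser):
    """Counts the nesting path of every start tag, e.g. ('em', 'label')."""

    def __init__(self):
        super().__init__()
        self.stack = []         # tags that are currently open
        self.paths = Counter()  # path tuple -> number of occurrences

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)
        self.paths[tuple(self.stack)] += 1

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; stray end tags are ignored.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def tag_paths(html):
    """Return a Counter of tag paths found in the given markup."""
    collector = TagPathCollector()
    collector.feed(html)
    collector.close()
    return collector.paths


def verify_target(source, target):
    """True if every tag path in source occurs at least as often in target."""
    # Counter subtraction keeps only positive counts, so an empty result
    # means nothing from the source is missing in the target.
    missing = tag_paths(source) - tag_paths(target)
    return not missing


source = "<em>Hello</em><label>What's your name</label>"
print(verify_target(source, "<em>Hi</em><label>My name is Jim</label>"))  # True
print(verify_target(source, "<label>My name is Jim</label><em>Hi</em>"))  # True
print(verify_target(source, "<em>Hi<label>My name is Jim</label></em>"))  # False
```

Sibling order is deliberately ignored here because the second example in the question expects True; if order matters, comparing lists of paths instead of a Counter would be one way to tighten the check.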
