1

Is there a way to check whether a tag is self-closing with HTMLParser?

I know self-closing tags are handled by the built-in method handle_startendtag().

However, it only handles them if they are explicitly closed, e.g. <img src="x.jpg"/>

and not: <img src="x.jpg">

I am making a program that takes an html file and spits out a sass template.

I want to close these img tags in the output file that are not explicitly closed in the html file.

Cheers

2 Answers

3

Not exactly a Python-specific solution, but if you want to know which tags have this "self-closing property", you can look at the official HTML5 specs: these are formally known as void elements.

area, base, br, col, embed, hr, img, input, keygen, link, menuitem,
meta, param, source, track, wbr

Strictly speaking, void elements do not have closing tags at all, but permit an extra / immediately before the >.
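Applied to the question's HTMLParser setup, the list above can drive the decision directly: treat any start tag whose name is a void element as if it had been written with the trailing slash. The following is only a minimal sketch under that assumption. The names VOID_ELEMENTS, render_attrs, VoidClosingParser and close_void_tags are hypothetical helpers made up for illustration, and the sketch drops comments and doctype declarations, so it is not a complete HTML serializer.

```python
from html import escape
from html.parser import HTMLParser

# Void elements as listed in the answer above.
VOID_ELEMENTS = frozenset({
    "area", "base", "br", "col", "embed", "hr", "img", "input", "keygen",
    "link", "menuitem", "meta", "param", "source", "track", "wbr",
})


def render_attrs(attrs):
    """Rebuild an attribute string from HTMLParser's (name, value) pairs."""
    parts = []
    for name, value in attrs:
        if value is None:
            # Attribute without a value, e.g. <input disabled>
            parts.append(f" {name}")
        else:
            parts.append(f' {name}="{escape(value, quote=True)}"')
    return "".join(parts)


class VoidClosingParser(HTMLParser):
    """Hypothetical parser that re-emits markup with void elements closed."""

    def __init__(self):
        # convert_charrefs=False keeps entities as separate events,
        # so they can be written back out unchanged.
        super().__init__(convert_charrefs=False)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_ELEMENTS:
            # Not explicitly closed in the source: treat it as if it were.
            self.handle_startendtag(tag, attrs)
        else:
            self.out.append(f"<{tag}{render_attrs(attrs)}>")

    def handle_startendtag(self, tag, attrs):
        self.out.append(f"<{tag}{render_attrs(attrs)}/>")

    def handle_endtag(self, tag):
        # A stray </img> or </br> is dropped; the tag is already closed.
        if tag not in VOID_ELEMENTS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def close_void_tags(html_text):
    """Return html_text with every void element written as <tag .../>."""
    parser = VoidClosingParser()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


print(close_void_tags('<p>Hi<br><img src="x.jpg"></p>'))
```

Routing void start tags through handle_startendtag() means explicitly closed and unclosed tags share one code path, so any per-tag logic for the template output only has to be written once.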

0

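A simple solution is to use BeautifulSoup, which writes void elements in self-closed form when it serializes the document. Passing "html.parser" explicitly keeps the output free of the <html><body> wrapper that other parsers add.

In [76]: from bs4 import BeautifulSoup

In [77]: BeautifulSoup('<img src="x.jpg">', 'html.parser')
Out[77]: <img src="x.jpg"/>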

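You can also check whether a tag is self-closing.

from bs4 import BeautifulSoup
from bs4.element import Tag

html = '<p>text<br><img src="x.jpg"></p>'  # sample input
soup = BeautifulSoup(html, "html.parser")
# soup.descendants walks the whole tree; keep only Tag nodes (skip text)
tags = [tag for tag in soup.descendants if isinstance(tag, Tag)]
self_closing = [tag for tag in tags if tag.is_empty_element]

Every Tag has an is_empty_element property (isSelfClosing is its older Beautiful Soup 3 name), so you can filter on it.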
