Feed URL from HTML using Python

Question

The RSS feed URL is available in a site's metadata (if the site has one). Is there a way to extract the feed URL(s) of a page using the urllib2 or HTMLParser modules? Or is there a better module available?

Thanks.

Zach Kelling · Accepted Answer · 2011-11-09 01:06:53Z


I prefer lxml. It has a very nice API, and its XPath support makes this fairly simple to accomplish:

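import lxml.html
# url_to_site is the page's URL; parse() also accepts a filename or file-like object
doc = lxml.html.parse(url_to_site)
# select the href of every <link> element declared as an RSS feed
feeds = doc.xpath('//link[@type="application/rss+xml"]/@href')  # list of feed URLs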
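If you would rather stay within the standard library, as the question asks, something along these lines should work. This is only a minimal sketch for Python 3, where HTMLParser lives in html.parser. The names FeedLinkParser, find_feeds and FEED_TYPES are made up for illustration. It also assumes you want Atom feeds as well as RSS, and that relative href values should be resolved against the page URL.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types treated as feeds here (an assumption: RSS plus Atom).
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Collects the href of every <link> tag whose type marks it as a feed."""

    def __init__(self, base_url=""):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "link":
            return
        attr = dict(attrs)
        link_type = (attr.get("type") or "").strip().lower()
        href = attr.get("href")
        if link_type in FEED_TYPES and href:
            # Resolve relative feed URLs against the page URL.
            self.feeds.append(urljoin(self.base_url, href))


def find_feeds(html, base_url=""):
    """Return the feed URLs advertised in an HTML string, in document order."""
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    return parser.feeds
```

To use it on a live page, fetch the HTML first (urllib2 became urllib.request in Python 3), decode the bytes to text, and pass the text together with the page URL to find_feeds.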

