Parse HTML to get specific tags in Python

Question

I'm trying to parse an HTML source with Python, using BeautifulSoup. I need to get all td tags whose ids have the form nameX, where X is a number starting from 1: name1, name2, and so on, for as many as there are.

How can I achieve this? My simple code using regex doesn't work.

soup = BeautifulSoup(response.text, "lxml")
resp = soup.find_all("td", {"id": "name*"})

Error:

IndexError: list index out of range

Andrej Shulaev · Accepted Answer · 2018-10-03 21:13:27Z

1

Use a lambda with startswith:

soup.find_all('td', id=lambda x: x and x.startswith('name'))

or a compiled regex (this needs import re):

soup.find_all('td', id=re.compile('^name'))
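
For reference, here is a minimal self-contained sketch of both approaches next to the original attempt. The sample markup and variable names are made up for illustration, and it uses the built-in "html.parser" instead of "lxml" only so it runs without extra dependencies. A plain string such as 'name*' is compared literally rather than as a pattern, so the original call returns an empty list, and indexing into that list is a likely source of the IndexError. The stricter pattern with \d+$ is an assumption based on the nameX description in the question.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for response.text
html = """
<table><tr>
  <td id="name1">Alice</td>
  <td id="name2">Bob</td>
  <td id="other">skip</td>
  <td>no id</td>
</tr></table>
"""

soup = BeautifulSoup(html, "html.parser")

# A plain string is matched literally, so "name*" finds nothing here
literal = soup.find_all("td", {"id": "name*"})

# Callable filter: x is None for tags without an id, hence the `x and` guard
by_prefix = soup.find_all("td", id=lambda x: x and x.startswith("name"))

# Compiled regex; \d+$ (an assumption) limits matches to "name" plus digits
by_regex = soup.find_all("td", id=re.compile(r"^name\d+$"))

ids_prefix = [td["id"] for td in by_prefix]
ids_regex = [td["id"] for td in by_regex]
print(literal, ids_prefix, ids_regex)
```

The lambda version keeps any id that starts with "name" (it would also match something like "namespace"), while the stricter regex only keeps ids that are "name" followed by digits.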

