import requests
from lxml import etree
req = requests.get(url)  # url is the page being scraped
tree = etree.HTML(req.text)

Now, instead of using XPath via tree.xpath(...), I would like to know whether I can search by class name or id, as in BeautifulSoup: soup.find('div', attrs={'class': 'myclass'}). I'm looking for something similar in lxml.

2 Answers

A far more concise way to do that in bs4 is to use a CSS selector:

soup.select('div.myclass') #  == soup.find_all('div',attrs={'class':'myclass'})

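lxml provides cssselect both as a module (which translates CSS selectors into XPath expressions under the hood) and as a convenience method on Element objects. Note that current lxml versions rely on the separate cssselect package, so it has to be installed alongside lxml.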

import lxml.html

tree = lxml.html.fromstring(req.text)
for div in tree.cssselect('div.myclass'):
    pass  # do something with each matching div

Alternatively, you can pre-compile the selector and apply it to your Element:

from lxml.cssselect import CSSSelector
selector = CSSSelector('div.myclass')

selection = selector(tree)

You say that you don't want to use xpath but don't explain why. If the goal is to search for a tag with a given class, you can do that easily with xpath.

For example, to find a div with the class "foo" you could do something like this:

tree.xpath("//div[@class='foo']")
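
One caveat: comparing @class to 'foo' only matches when the attribute value is exactly "foo", so an element with class="foo bar" is missed. A common XPath 1.0 idiom for matching a single class token is sketched below; the sample markup and the id "main" are invented for the example.

```python
from lxml import etree

# Hypothetical markup; etree.HTML wraps the fragment in <html><body>
html = (
    '<div id="main" class="foo bar">a</div>'
    '<div class="foo">b</div>'
    '<div class="foobar">c</div>'
)
tree = etree.HTML(html)

# Exact comparison: only the div whose class attribute is exactly "foo"
exact = [d.text for d in tree.xpath("//div[@class='foo']")]

# Token match: pad the class list with spaces, then look for " foo "
token_expr = "//div[contains(concat(' ', normalize-space(@class), ' '), ' foo ')]"
token = [d.text for d in tree.xpath(token_expr)]

# Lookup by id works with a plain attribute comparison
by_id = [d.text for d in tree.xpath("//*[@id='main']")]

print(exact, token, by_id)
```

The token form skips class="foobar" while still catching class="foo bar", which is closer to what BeautifulSoup's class matching does.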
