
I am trying to scrape a website for my project, but I'm having trouble scraping image names from it via Selenium.


With the code below, I am able to use Selenium to return the text data from the website:

results = driver.find_elements_by_css_selector("li.result_content")

for result in results:
    company_name = result.find_element_by_tag_name("h3").get_attribute("innerText")
    product_name = result.find_element_by_id('sProdName').get_attribute("innerText")
    product_paymode = result.find_element_by_id('paymode').get_attribute("innerText")

I was told to use get_attribute("innerText") because several items are hidden, and get_attribute("innerText") returns the text of hidden items as well. (Sure enough, it works.)

My question is: how do I scrape the prod-feature-icon class to tell whether that picture is active or not?

  • I have updated my answer Commented Sep 14, 2016 at 13:44

1 Answer


Why not use find_element_by_class_name?

feature_icon = result.find_element_by_class_name("prod-feature-icon")

However, it's worth noting that the element with this class name is actually a UL containing several images, so you need to decide exactly which image you want to work with. Alternatively, you could iterate through them with:

for item in feature_icon.find_elements_by_tag_name('img'):
    print(item.get_attribute('src'))

Of course, this still wouldn't tell you whether the item is active or inactive, because that doesn't seem to be dictated by the CSS but rather by the shading of the image.
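If the site does happen to encode the state in the image file name (that is an assumption, it may only be visible in the shading), a minimal sketch for pulling the file name out of each src value could look like the following. The helper name image_file_name and the example.com URLs are hypothetical, made up for illustration only.

```python
import posixpath
from urllib.parse import urlparse


def image_file_name(src):
    """Return the file name part of an image URL, or None if src is empty."""
    if not src:
        # get_attribute('src') returns None when the attribute is missing
        return None
    # urlparse keeps the query string and fragment out of .path
    return posixpath.basename(urlparse(src).path)


# Hypothetical src values, standing in for item.get_attribute('src')
example_srcs = [
    "https://example.com/images/icon_a.png?v=3",
    "https://example.com/images/icon_b.png",
    None,
]
names = [image_file_name(src) for src in example_srcs]
print(names)
```

Inside the loop above, this would be called as image_file_name(item.get_attribute('src')). Whether the resulting names actually distinguish active from inactive icons depends on how the site names its image files.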


8 Comments

Shouldn't it be find_elements_by_class_name?
Well, there are two methods: when you want to iterate through the whole set, you use the plural; when you don't care which element you get, or you are sure there's only one, you use the singular.
I've tried it, but it seems to print out None instead of the image name.
Thanks for your quick response! It's definitely getting me something! But it returns multiple rows of <selenium.webdriver.remote.webelement.WebElement (session="f556e2cc0f91051505a03ab078bbdb89", element="0.6230983186130168-646")>. Would it be better if I put in the website I'm trying to scrape?
I've updated my question to include the website. In the result_container class, I am trying to extract the name of the image file in the product feature to see whether it is active or not.