I am writing a scraper and trying to loop over the contents of an element with the class offer-list-wrapper, which contains multiple child elements. Almost all of these elements are the same on the results pages for search A and search B.

As you can see in both images, offer-list-wrapper is a common element.

I want to extract the data inside every element with the classes organic-offer-wrapper organic-gallery-offer-inner or organic-list-offer-inner m-gallery-product-item-v2. That is easy to do if you loop over them with a CSS selector like this:

for element in driver.find_elements_by_css_selector('.organic-list-offer-inner.m-gallery-product-item-v2'):

In that way you can get every element inside them.

But here is the problem: I need ONE generic piece of code that loops over both classes, and if a new class appears it has to loop over that one too.

Let me show you my code:

for element in driver.find_elements_by_class_name('offer-list-wrapper'):
    try:
        item_name = element.find_element_by_class_name('organic-gallery-title__content').text
    except:
        item_name = np.nan
    try:
        price = element.find_element_by_class_name('gallery-offer-price').get_attribute('title').replace('$', '').replace(',', '')
        min_order = element.find_element_by_class_name('gallery-offer-minorder').find_element_by_tag_name('span').text.replace(' Pieces', '').replace(' Piece', '').replace(' Units', '').replace(' Unit', '').replace(' Sets', '').replace(' Set', '').replace(' Pairs', '').replace(' Pair', '').replace('Boxes', '').replace('Box', '').replace('Bags', '').replace('Bag', '')     
        # separate min and max price
    except:
        price = np.nan
        min_order = np.nan

This first version returns only the first element. Here is my second attempt:

for element in driver.find_elements_by_css_selector('.organic-offer-wrapper.organic-gallery-offer-inner'):
    try:
        item_name = element.find_element_by_class_name('organic-gallery-title__content').text
    except:
        item_name = np.nan
    try:
        price = element.find_element_by_class_name('gallery-offer-price').get_attribute('title').replace('$', '').replace(',', '')
        min_order = element.find_element_by_class_name('gallery-offer-minorder').find_element_by_tag_name('span').text.replace(' Pieces', '').replace(' Piece', '').replace(' Units', '').replace(' Unit', '').replace(' Sets', '').replace(' Set', '').replace(' Pairs', '').replace(' Pair', '').replace('Boxes', '').replace('Box', '').replace('Bags', '').replace('Bag', '')     
        # separate min and max price
    except:
        price = np.nan
        min_order = np.nan

This second version loops over every .organic-offer-wrapper.organic-gallery-offer-inner element (returning everything I need from them), but it does not loop over .organic-list-offer-inner.m-gallery-product-item-v2.

1 Answer

You can get all the products by searching for the div tags that have the attribute data-content="productItem". That assumes each item has that attribute; from the screenshots you posted, that seems to be the case.

You can accomplish this using find_elements_by_xpath():

for item in driver.find_elements_by_xpath('//div[@data-content="productItem"]'):
    ...  # extract the fields you need from each item here

This is probably the best approach, since you don't have to worry about the elements having different CSS classes.
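
If you want to check the idea without launching a browser, here is a minimal sketch that uses the standard library's ElementTree instead of Selenium. The markup in SAMPLE is made up and heavily simplified (it only assumes, as in your screenshots, that every product card carries data-content="productItem"), and product_titles is a hypothetical helper name, not part of any library. Note that ElementTree needs the relative form .//div rather than //div.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup: two product layouts with different
# CSS classes but the same data-content attribute, plus one non-product div.
SAMPLE = """
<div class="offer-list-wrapper">
  <div class="organic-offer-wrapper organic-gallery-offer-inner" data-content="productItem">
    <p class="organic-gallery-title__content">Item A</p>
  </div>
  <div class="organic-list-offer-inner m-gallery-product-item-v2" data-content="productItem">
    <p class="organic-gallery-title__content">Item B</p>
  </div>
  <div class="some-banner">Not a product</div>
</div>
"""


def product_titles(markup):
    """Return the title of every div marked data-content="productItem"."""
    root = ET.fromstring(markup)
    titles = []
    # Select by attribute, so the class names on the card do not matter.
    for item in root.findall(".//div[@data-content='productItem']"):
        title = item.find(".//p[@class='organic-gallery-title__content']")
        # Fall back to None when a card has no title element.
        titles.append(title.text if title is not None else None)
    return titles


print(product_titles(SAMPLE))  # ['Item A', 'Item B']
```

Both cards are picked up even though they share no class, and the banner div is skipped. With a real driver the loop is the one shown above; on Selenium 4 releases that no longer ship the find_elements_by_* helpers, the equivalent call is driver.find_elements(By.XPATH, '//div[@data-content="productItem"]') with By imported from selenium.webdriver.common.by.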

1 Comment

This worked perfectly. In my code I used the name element instead of item: for element in driver.find_elements_by_xpath('//div[@data-content="productItem"]'): Thank you for your help! :)
