
I'm trying to iterate through multiple nodes and retrieve various child nodes from the parent nodes. Assume I have something like the following structure:

<div class="wrapper">
    <div class="item">
        <div class="item-footer">
            <div class="item-type">Some data in here</div>
        </div>
    </div>
    <!-- More items listed here -->
</div>

I'm able to retrieve all child nodes of the wrapper container with the following:

wrapper = driver.find_element(By.XPATH, '/html/body/div')
items = wrapper.find_elements(By.XPATH, './/*')

Anyway, I couldn't figure out how to retrieve the inner HTML of the container holding the item-type information. I tried this, but it didn't work:

for item in items:
    item_type = item.find_element(By.XPATH, './/div/div').get_attribute('innerHTML')
    print(item_type)

This results in the following error:

NoSuchElementException: Message: Unable to locate element:

Does anybody know how I can do that?
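The cause of the error can be reproduced without Selenium. `.//*` matches every descendant, including the innermost leaf nodes, and on those the nested `.//div/div` lookup finds nothing, which is exactly when `find_element` raises `NoSuchElementException`. A minimal sketch with Python's stdlib ElementTree (using the markup from the question, minus the comment, and counting matches instead of raising):

```python
import xml.etree.ElementTree as ET

# The structure from the question (well-formed, so ElementTree can parse it).
html = """
<div class="wrapper">
    <div class="item">
        <div class="item-footer">
            <div class="item-type">Some data in here</div>
        </div>
    </div>
</div>
"""

wrapper = ET.fromstring(html)
# './/*' matches EVERY descendant of the wrapper, not just the items ...
for node in wrapper.findall(".//*"):
    # ... and on the two inner divs, './/div/div' matches nothing at all,
    # which is the situation where Selenium's find_element raises.
    matches = node.findall(".//div/div")
    print(node.get("class"), "->", len(matches))
```

Only the outer `item` div yields a match; the `item-footer` and `item-type` divs yield zero, so the loop in the question is guaranteed to hit an element where the search fails.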

  • It would be great if you could provide a few more child nodes in your example. – Commented Dec 9, 2021 at 16:36

3 Answers


If all the elements whose content you want are div elements with the class attribute value item-type, located inside divs with the class attribute value item-footer, you can simply do the following:

elements = driver.find_elements(By.XPATH, '//div[@class="item-footer"]//div[@class="item-type"]')
for element in elements:
    data = element.get_attribute('innerHTML')
    print(data)



You can use BeautifulSoup on the page source you get from Selenium to easily scrape the HTML data.

from bs4 import BeautifulSoup

# selenium code part
# ....
# ....
# driver.page_source is the HTML result from selenium

html_doc = BeautifulSoup(driver.page_source, 'html.parser')
items = html_doc.find_all('div', attrs={'class':'item'})
for item in items:
    text = item.find('div', attrs={'class':'item-type'}).text
    print(text)

Output:

Some data in here



You just need to find a relative XPath that identifies each element, then iterate over the matches.

items = driver.find_elements(By.XPATH, "//div[@class='wrapper']//div[@class='item']//div[@class='item-type']")
for item in items:
    print(item.text)
    print(item.get_attribute('innerHTML'))

Or use a CSS selector:

items = driver.find_elements(By.CSS_SELECTOR, ".wrapper > .item .item-type")
for item in items:
    print(item.text)
    print(item.get_attribute('innerHTML'))
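To see that the relative-XPath idea itself selects the right nodes, here is a Selenium-free sketch using Python's stdlib ElementTree on the snippet from the question (the outermost div becomes the parse root, so the path is relative to the wrapper; ElementTree supports only a limited XPath subset, but `.//tag[@attr='value']` is part of it):

```python
import xml.etree.ElementTree as ET

# The markup from the question, minus the HTML comment.
html = """
<div class="wrapper">
    <div class="item">
        <div class="item-footer">
            <div class="item-type">Some data in here</div>
        </div>
    </div>
</div>
"""

root = ET.fromstring(html)
# Same relative-XPath idea as the Selenium calls above: any item-type div
# anywhere below the wrapper, regardless of nesting depth.
for item in root.findall(".//div[@class='item-type']"):
    print(item.text)  # -> Some data in here
```

With more items inside the wrapper, the same expression returns one element per item, in document order, just as `find_elements` does in Selenium.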

