
I'm trying to iterate through multiple nodes and retrieve various child nodes from the parent nodes. Assume I have something like the following structure:

<div class="wrapper">
    <div class="item">
        <div class="item-footer">
            <div class="item-type">Some data in here</div>
        </div>
    </div>
    <!-- More items listed here -->
</div>

I'm able to retrieve all child nodes of the wrapper container with the following:

wrapper = driver.find_element(By.XPATH, '/html/body/div')
items = wrapper.find_elements(By.XPATH, './/*')

Anyway, I couldn't figure out how to retrieve the inner HTML of the container holding the item-type information. I tried this, but it didn't work:

for item in items:
    item_type = item.find_element(By.XPATH, './/div/div').get_attribute('innerHTML')
    print(item_type)

This results in the following error:

NoSuchElementException: Message: Unable to locate element:

Does anybody know how I can do that?
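The cause of the error can be reproduced without Selenium. `.//*` matches every descendant, including the innermost leaf nodes, and on those the nested `.//div/div` lookup finds nothing, which is exactly when `find_element` raises `NoSuchElementException`. A minimal sketch with Python's stdlib ElementTree (using the markup from the question, minus the comment, and counting matches instead of raising):

```python
import xml.etree.ElementTree as ET

# The structure from the question (well-formed, so ElementTree can parse it).
html = """
<div class="wrapper">
    <div class="item">
        <div class="item-footer">
            <div class="item-type">Some data in here</div>
        </div>
    </div>
</div>
"""

wrapper = ET.fromstring(html)
# './/*' matches EVERY descendant of the wrapper, not just the items ...
for node in wrapper.findall(".//*"):
    # ... and on the two inner divs, './/div/div' matches nothing at all,
    # which is the situation where Selenium's find_element raises.
    matches = node.findall(".//div/div")
    print(node.get("class"), "->", len(matches))
```

Only the outer `item` div yields a match; the `item-footer` and `item-type` divs yield zero, so the loop in the question is guaranteed to hit an element where the search fails.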

  • It would be great if you could provide a few more child nodes in your example. – Commented Dec 9, 2021 at 16:36

3 Answers


If all the elements whose content you want are div elements with the class attribute value item-type, located inside divs with the class attribute value item-footer, you can simply do the following:

elements = driver.find_elements(By.XPATH, '//div[@class="item-footer"]//div[@class="item-type"]')
for element in elements:
    data = element.get_attribute('innerHTML')
    print(data)



You can use BeautifulSoup on the page source you get from Selenium to easily scrape the HTML data.

from bs4 import BeautifulSoup

# selenium code part
# ....
# ....
# driver.page_source is the HTML result from selenium

html_doc = BeautifulSoup(driver.page_source, 'html.parser')
items = html_doc.find_all('div', attrs={'class':'item'})
for item in items:
    text = item.find('div', attrs={'class':'item-type'}).text
    print(text)

Output:

Some data in here



You just need to find a relative XPath that identifies each element, then iterate over the matches.

items = driver.find_elements(By.XPATH, "//div[@class='wrapper']//div[@class='item']//div[@class='item-type']")
for item in items:
    print(item.text)
    print(item.get_attribute('innerHTML'))

Or use a CSS selector:

items = driver.find_elements(By.CSS_SELECTOR, ".wrapper > .item .item-type")
for item in items:
    print(item.text)
    print(item.get_attribute('innerHTML'))
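To see that the relative-XPath idea itself selects the right nodes, here is a Selenium-free sketch using Python's stdlib ElementTree on the snippet from the question (the outermost div becomes the parse root, so the path is relative to the wrapper; ElementTree supports only a limited XPath subset, but `.//tag[@attr='value']` is part of it):

```python
import xml.etree.ElementTree as ET

# The markup from the question, minus the HTML comment.
html = """
<div class="wrapper">
    <div class="item">
        <div class="item-footer">
            <div class="item-type">Some data in here</div>
        </div>
    </div>
</div>
"""

root = ET.fromstring(html)
# Same relative-XPath idea as the Selenium calls above: any item-type div
# anywhere below the wrapper, regardless of nesting depth.
for item in root.findall(".//div[@class='item-type']"):
    print(item.text)  # -> Some data in here
```

With more items inside the wrapper, the same expression returns one element per item, in document order, just as `find_elements` does in Selenium.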

