
I am having a weird issue with Python and Selenium. I am accessing the URL https://www.biggerpockets.com/users/JarridJ1. When you click More, it shows further content. I understand that it is a React-based website. When I view it in a browser and do a View Source, I can see the required data in a React element: <div data-react-class="Profile/Header/Header" data-react-props="{&quot. I tried to automate Firefox via Selenium, but I could not get the content that way either. Check the screenshot:

Below is the code I tried:

from time import sleep

from selenium import webdriver
from selenium.webdriver.firefox.options import Options


def parse(u):
    print('Processing... {}'.format(u))
    driver.get(u)
    sleep(2)
    html = driver.page_source
    driver.save_screenshot('bp.png')
    print(html)


if __name__ == '__main__':
    options = Options()
    options.add_argument("--headless")  # run the browser without a visible window
    driver = webdriver.Firefox(options=options)  # pass the options so --headless takes effect
    parse('https://www.biggerpockets.com/users/JarridJ1')
  • As per the code, you are trying to fetch the page source, right? What issue are you facing, or what error are you getting? Commented Mar 25, 2020 at 9:54
  • @SameerArora If you do a page source view-source:https://www.biggerpockets.com/users/JarridJ1 in the browser, you will find text like [email protected], but when you check the HTML returned by Selenium, it is not there. Commented Mar 25, 2020 at 10:10

1 Answer


This is a tricky one, but I found a way to get to the element you have highlighted. I am still not sure why driver.page_source is not returning what you are looking for.

def parse(u):
    print('Processing... {}'.format(u))
    driver.get(u)
    sleep(2)
    get_everything = driver.find_elements_by_xpath("//*")
    for element in get_everything:
        print(element.get_attribute('innerHTML'))

    #html = driver.page_source
    #driver.save_screenshot('bp.png')
    #print(html)

Below is my standalone example:

from selenium import webdriver
import time


driver = webdriver.Chrome(r"C:\Path\To\chromedriver.exe")  # raw string so the backslashes are not treated as escapes
driver.get("https://www.biggerpockets.com/users/JarridJ1")
time.sleep(5)
a = driver.find_element_by_xpath("//div[@data-react-class='Profile/Header/Header']")
b = a.get_attribute("data-react-props")
print(b)
c = driver.find_elements_by_xpath("//*")
for i in c:
    print(i.get_attribute('innerHTML'))
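Note that the data-react-props value is itself a JSON string, so once you have it you can load it into a dict instead of grepping raw text. A minimal sketch using only the standard library's json module; the sample payload and its field names below are invented for illustration, and the real attribute on the profile page is much larger:

```python
import json

# Invented stand-in for what get_attribute("data-react-props") returns;
# Selenium hands the attribute back with HTML entities already decoded.
b = '{"profile": {"name": "Jarrid J.", "location": "Philadelphia, PA"}}'

props = json.loads(b)  # the attribute is plain JSON once decoded
print(props["profile"]["name"])
```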


4 Comments

I have updated the question. The required text is present though not visible. I just attached the screenshot. I did not click more for it.
@Volatil3 I think I understand better what you are looking for now. See the updated answer.
Wow, yes, this works. Though it is not so relevant: can I do something similar without Selenium, using requests and bs4 instead? I mean a = driver.find_element_by_xpath("//div[@data-react-class='Profile/Header/Header']")?
I find that requests does not work well for React hosted sites.
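That said, since the question notes the props are present in the raw page source, a plain HTTP fetch can work in principle for this particular attribute: download the HTML and pull data-react-props out of the static markup. A minimal dependency-free sketch using the standard library's html.parser (with bs4 the equivalent would be soup.find("div", attrs={"data-react-class": "Profile/Header/Header"})); the inline HTML snippet is a stand-in for the real requests.get(url).text response, and its props content is invented:

```python
import json
from html.parser import HTMLParser


class ReactPropsExtractor(HTMLParser):
    """Collect data-react-props from the div with a matching data-react-class."""

    def __init__(self, react_class):
        super().__init__()
        self.react_class = react_class
        self.props = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser has already decoded entity references like &quot;
        # in the attribute values by the time they reach this callback.
        attrs = dict(attrs)
        if tag == "div" and attrs.get("data-react-class") == self.react_class:
            self.props = json.loads(attrs["data-react-props"])


# Stand-in for the HTML you would get from requests.get(url).text
html = ('<div data-react-class="Profile/Header/Header" '
        'data-react-props="{&quot;name&quot;: &quot;Jarrid&quot;}"></div>')

extractor = ReactPropsExtractor("Profile/Header/Header")
extractor.feed(html)
print(extractor.props)
```

This only works because the site serializes the props into the initial HTML; anything rendered client-side after load would still need a real browser.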
