
I am using Selenium to scrape a web page that is dynamically generated by JavaScript. It works fine when I run the calls directly from the Python terminal (cmd), but it doesn't work when I implement the same functionality in a class.

My class implementation is:

    from selenium import webdriver


    class web_scraper():
        def __init__(self):
            # start chrome driver
            self.driver = webdriver.Chrome(executable_path="./config/chromedriver.exe")

        # scrape web page from specified url
        def scrape_page(self, url):
            html = None
            try:
                # load the page
                self.driver.get(url)

                # read the html of the current DOM
                html = self.driver.execute_script("return document.documentElement.innerHTML;")
            except Exception as e:
                print('[Error:] Scraping failed.')
                print(f'[Exception:] {e}')

            return html


    if __name__ == '__main__':
        url = "https://wipp.edmundsassoc.com/Wipp/?wippid=1205#taxPage9"
        scraper = web_scraper()
        content = scraper.scrape_page(url)

The code I used in the terminal is:

    driver = webdriver.Chrome(executable_path='E:/Projects/Python_Projects/WebScraping/config/chromedriver.exe')
    driver.get("https://wipp.edmundsassoc.com/Wipp/?wippid=1205#taxPage30")
    content = driver.execute_script("return document.documentElement.innerHTML;")

The output of the class implementation is:

<head>
    <meta http-equiv="content-type" content="text/html; charset=UTF-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <link type="text/css" rel="stylesheet" href="Wipp.css">
    <title>WIPP</title>
  <link rel="stylesheet" href="https://wipp.edmundsassoc.com/Wipp/wipp/gwt/standard/standard.css"><script src="https://wipp.edmundsassoc.com/Wipp/wipp/0D3421F8F9508D2F958C63CE2A48BAD8.cache.js"></script></head>

  <body>
    <script type="text/javascript" language="javascript" src="wipp/wipp.nocache.js"></script>
    <iframe src="javascript:''" id="__gwt_historyFrame" tabindex="-1" style="position:absolute;width:0;height:0;border:0"></iframe>


</body>

When I run the same commands in the Python terminal, the output is fine.

Any help with this would be appreciated. Thanks!

I am using Windows, and my Python version is 3.6.

    It takes a little while for the table to render. Try adding some sort of wait (either explicit or implicit). Commented May 27, 2020 at 20:00

1 Answer

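Add a time.sleep() call after loading the URL, so the page's JavaScript has time to render the content before you read the HTML. This needs import time at the top of the file.

    self.driver.get(url)
    time.sleep(10)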
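A fixed sleep either wastes time or is too short on a slow connection. As a rough sketch of the alternative, the helper below polls the DOM until a caller-supplied check passes or a timeout expires. The function name scrape_when_ready and its is_ready, timeout, and poll parameters are hypothetical, not part of Selenium; the only driver calls it uses are get() and execute_script(), the same ones as in the question. Selenium also ships its own explicit-wait helper, WebDriverWait, which may be the better fit in real code.

```python
import time


def scrape_when_ready(driver, url, is_ready, timeout=30.0, poll=0.5):
    """Load url, then poll the DOM until is_ready(html) is true.

    Hypothetical helper: `driver` is any object with get() and
    execute_script(), such as a Selenium WebDriver.
    Raises TimeoutError if the check never passes within `timeout` seconds.
    """
    driver.get(url)
    deadline = time.monotonic() + timeout
    while True:
        # Take a snapshot of the DOM as it currently stands.
        html = driver.execute_script("return document.documentElement.innerHTML;")
        if html and is_ready(html):
            return html
        if time.monotonic() >= deadline:
            raise TimeoutError(f"page not ready after {timeout} seconds")
        # Give the page's JavaScript more time before checking again.
        time.sleep(poll)
```

Inside scrape_page this could be called as scrape_when_ready(self.driver, url, lambda html: "<table" in html). The "<table" marker is only an assumption based on the comment about the table rendering late; use whatever text or tag reliably appears once the page has finished loading.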