Open multiple pages with python selenium

Question

I'm trying to use Python and Selenium to loop through a list of webpages and download a file from each page. I can open one page and download the first file I want with a while loop, but as soon as I get to the second element in the list of webpages, Selenium raises an error.

Here is my code:

path_to_chromedriver = 'path to chromedriver location'
browser = webdriver.Chrome(executable_path = path_to_chromedriver)

browser.get("file:///path to html file")

#these are example webpages
all_trails = ['www.google.com', 'www.yahoo.com', 'www.bing.com']

index = 0

while (index <= 2):

    url = all_trails[index]
    browser.get(url)

    browser.find_element_by_link_text('Sign In').click()

    username = browser.find_element_by_xpath("//input[@placeholder='Log in with email']")
    password = browser.find_element_by_name('pass')

    username.send_keys("username")
    password.send_keys("password")

    browser.find_element_by_xpath("//button[@type='submit' and @class='btn btn-primary btn-lg' and contains(text(), 'Log In')]").click()

    results_url = browser.find_element_by_xpath("//a[@class='require-user' and contains(text(), 'GPX File')]").click()
    index += 1

    browser.quit()
    time.sleep(5)

I'm able to download the file from the first element in the list, which is www.google.com. The loop reaches the second element, www.yahoo.com, but as soon as it gets to browser.get(url) I run into this error:

Traceback (most recent call last):
  File "trails_scraper.py", line 22, in <module>
    browser.get(url)
  File "/Library/Python/2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 320, in get
    self.execute(Command.GET, {'url': url})
  File "/Library/Python/2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 306, in execute
    response = self.command_executor.execute(driver_command, params)
  File "/Library/Python/2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 460, in execute
    return self._request(command_info[0], url, body=data)
  File "/Library/Python/2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 483, in _request
    self._conn.request(method, parsed_url.path, body, headers)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1053, in request
    self._send_request(method, url, body, headers)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1093, in _send_request
    self.endheaders(body)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1049, in endheaders
    self._send_output(message_body)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 893, in _send_output
    self.send(msg)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 855, in send
    self.connect()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 832, in connect
    self.timeout, self.source_address)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 575, in create_connection
    raise err
socket.error: [Errno 61] Connection refused

Does anyone know what is going on? I know the more error prone method is to use a for loop but logically my code seems correct.

Any help would be magnificently appreciated :)

This helped, but it doesn't open another webpage to download the file. Carlo 1585's answer below allows the webdriver to open another page; I didn't realize it needed the path to chromedriver to open another page. Thanks though!
– Brian Dela Cruz, Commented Dec 15, 2017 at 18:56
What do you mean by "needed the path to chromedriver to open another page"? You need the path to chromedriver just to run chromedriver. If the chromedriver file is already on the PATH, you don't need to specify it explicitly. And it definitely cannot affect opening another page!
– Andersson, Commented Dec 15, 2017 at 19:07

Carlo 1585 · Accepted Answer · 2017-12-15 14:35:39Z

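The problem is that you create the browser outside the loop but call browser.quit() inside it. At the end of the first iteration the browser (and its chromedriver process) is shut down, so on the second iteration this call fails:

browser.get(url)

There is no longer a browser to connect to, which is why you get "Connection refused".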
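
To see the failure without launching Chrome, here is a minimal sketch that mimics the question's loop with a stand-in object. FakeBrowser and run_buggy_loop are hypothetical names invented for illustration, not Selenium classes, and a real driver raises a connection error rather than RuntimeError.

```python
class FakeBrowser(object):
    """Hypothetical stand-in for a WebDriver; not part of Selenium."""

    def __init__(self):
        self.alive = True
        self.visited = []

    def get(self, url):
        if not self.alive:
            # A real driver fails with a connection error at this point.
            raise RuntimeError("browser already quit, cannot open " + url)
        self.visited.append(url)

    def quit(self):
        self.alive = False


def run_buggy_loop(urls):
    """Mimic the question's loop: one browser, quit() inside the loop."""
    browser = FakeBrowser()
    opened = []
    error = None
    for url in urls:
        try:
            browser.get(url)
            opened.append(url)
        except RuntimeError as exc:
            error = str(exc)
            break
        browser.quit()  # the bug: this ends the session after the first page
    return opened, error
```

Calling run_buggy_loop with three URLs opens only the first one and then reports an error for the second, which matches the behaviour described in the question.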

You have two solutions:

1) Create the browser inside the loop, so every iteration starts a new one:

import time

from selenium import webdriver

# Written against the Selenium 3 API used in the question.
path_to_chromedriver = 'path to chromedriver location'

# these are example webpages; browser.get() needs a full URL, scheme included
all_trails = ['https://www.google.com', 'https://www.yahoo.com', 'https://www.bing.com']

index = 0

while index < len(all_trails):
    # start a fresh browser for this page
    browser = webdriver.Chrome(executable_path = path_to_chromedriver)

    browser.get("file:///path to html file")

    url = all_trails[index]
    browser.get(url)

    browser.find_element_by_link_text('Sign In').click()

    username = browser.find_element_by_xpath("//input[@placeholder='Log in with email']")
    password = browser.find_element_by_name('pass')

    username.send_keys("username")
    password.send_keys("password")

    browser.find_element_by_xpath(
        "//button[@type='submit' and @class='btn btn-primary btn-lg' "
        "and contains(text(), 'Log In')]"
    ).click()

    browser.find_element_by_xpath(
        "//a[@class='require-user' and contains(text(), 'GPX File')]"
    ).click()
    index += 1

    # close this page's browser; the next iteration opens a new one
    browser.quit()
    time.sleep(5)

2) Keep a single browser and close it only after the loop has finished:

import time

from selenium import webdriver

path_to_chromedriver = 'path to chromedriver location'
browser = webdriver.Chrome(executable_path = path_to_chromedriver)

browser.get("file:///path to html file")

# these are example webpages; browser.get() needs a full URL, scheme included
all_trails = ['https://www.google.com', 'https://www.yahoo.com', 'https://www.bing.com']

index = 0

while index < len(all_trails):

    url = all_trails[index]
    browser.get(url)

    browser.find_element_by_link_text('Sign In').click()

    username = browser.find_element_by_xpath("//input[@placeholder='Log in with email']")
    password = browser.find_element_by_name('pass')

    username.send_keys("username")
    password.send_keys("password")

    browser.find_element_by_xpath(
        "//button[@type='submit' and @class='btn btn-primary btn-lg' "
        "and contains(text(), 'Log In')]"
    ).click()

    browser.find_element_by_xpath(
        "//a[@class='require-user' and contains(text(), 'GPX File')]"
    ).click()
    index += 1
    time.sleep(5)

# quit once, after every page has been visited
browser.quit()
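
A variation on the second solution, sketched under some assumptions: wrapping the loop in try/finally makes quit() run exactly once, even if one of the pages raises an exception. visit_all, make_browser and handle_page are hypothetical names, not part of Selenium. The browser is passed in as a factory so the loop logic can be tried without a real driver.

```python
def visit_all(urls, make_browser, handle_page):
    """Open every URL in one browser session and always quit at the end.

    make_browser: zero-argument callable returning a driver-like object
                  with get() and quit() methods.
    handle_page:  callable(browser, url) that does the per-page work,
                  such as logging in and clicking the download link.
    """
    browser = make_browser()
    visited = []
    try:
        for url in urls:
            browser.get(url)
            handle_page(browser, url)
            visited.append(url)
    finally:
        # Runs once, after the loop, even if a page raised an exception.
        browser.quit()
    return visited
```

With Selenium installed and chromedriver on the PATH, a call might look like visit_all(all_trails, webdriver.Chrome, download_gpx), where download_gpx is a hypothetical function holding the login and click steps from the code above.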

The first solution worked but I used a combination of both solutions.
