I'm trying to use Python and Selenium to loop through a list of webpages and download a file from each page. I can open one page at a time and download the first file I want with a while loop, but as soon as I get to the second element in the list of webpages, Selenium errors out.
Here is my code:
import time
from selenium import webdriver

path_to_chromedriver = 'path to chromedriver location'
browser = webdriver.Chrome(executable_path=path_to_chromedriver)
browser.get("file:///path to html file")
# these are example webpages
all_trails = ['www.google.com', 'www.yahoo.com', 'www.bing.com']
index = 0
while (index <= 2):
    url = all_trails[index]
    browser.get(url)
    browser.find_element_by_link_text('Sign In').click()
    username = browser.find_element_by_xpath("//input[@placeholder='Log in with email']")
    password = browser.find_element_by_name('pass')
    username.send_keys("username")
    password.send_keys("password")
    browser.find_element_by_xpath("//button[@type='submit' and @class='btn btn-primary btn-lg' and contains(text(), 'Log In')]").click()
    results_url = browser.find_element_by_xpath("//a[@class='require-user' and contains(text(), 'GPX File')]").click()
    index += 1
    browser.quit()
    time.sleep(5)
I'm able to download the file from the first element in the list, which is www.google.com. The loop reaches the second element, www.yahoo.com, but as soon as it calls browser.get(url) I run into this error:
Traceback (most recent call last):
  File "trails_scraper.py", line 22, in <module>
    browser.get(url)
  File "/Library/Python/2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 320, in get
    self.execute(Command.GET, {'url': url})
  File "/Library/Python/2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 306, in execute
    response = self.command_executor.execute(driver_command, params)
  File "/Library/Python/2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 460, in execute
    return self._request(command_info[0], url, body=data)
  File "/Library/Python/2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 483, in _request
    self._conn.request(method, parsed_url.path, body, headers)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1053, in request
    self._send_request(method, url, body, headers)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1093, in _send_request
    self.endheaders(body)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1049, in endheaders
    self._send_output(message_body)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 893, in _send_output
    self.send(msg)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 855, in send
    self.connect()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 832, in connect
    self.timeout, self.source_address)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 575, in create_connection
    raise err
socket.error: [Errno 61] Connection refused
Does anyone know what is going on? I know a for loop would be the less error-prone approach, but logically my code seems correct.
Any help would be magnificently appreciated :)
browser.quit() outside for loop?
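To spell that suggestion out: quit() ends the WebDriver session and shuts down the driver process, so if it runs inside the loop, the next browser.get(url) has nothing left to connect to, which would fit the "Connection refused" above. Here is a minimal sketch of the loop with quit() called once at the end. The names visit_all and handle_page are hypothetical helpers, not part of Selenium; handle_page stands in for the sign-in and download steps from the question.

```python
def visit_all(browser, urls, handle_page):
    """Load each URL in the same browser session, then quit exactly once.

    `browser` is assumed to be a Selenium WebDriver (anything with get() and
    quit() works); `handle_page` is a hypothetical callback that performs the
    per-page work, such as signing in and clicking the download link.
    """
    try:
        for url in urls:
            browser.get(url)      # reuses the one live session
            handle_page(browser)  # per-page steps go here
    finally:
        # Quit only after every page has been handled, even if a page fails.
        browser.quit()
```

With the script from the question this might be called as visit_all(browser, all_trails, download_gpx), where download_gpx is a function you would write around the existing find_element and click lines. If each page really does need a fresh browser, the alternative is to create a new webdriver.Chrome(...) at the top of each iteration rather than reusing one that has already quit.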