I am writing a web scraper using Selenium for Python. The scraper is visiting the same sites many times per hour, therefore I was hoping to find a way to alter my IP every few searches. What is the best strategy for this (I am using firefox)? Is there any prewritten code/a csv of IP addresses I can switch through? I am completely new to masking IP, proxies, etc. so please go easy on me!
2 Answers
Try using a proxy. There are free options (not so reliable) or paid services.
from selenium import webdriver
def change_proxy(proxy,port):
profile = webdriver.FirefoxProfile()
profile.set_preference("network.proxy.type", 1)
profile.set_preference("network.proxy.http", proxy)
profile.set_preference("network.proxy.http_port", port)
profile.set_preference("network.proxy.ssl", proxy)
profile.set_preference("network.proxy.ssl_port", port)
driver = webdriver.Firefox(profile)
return driver
2 Comments
Your ISP will assign you your IP address. If you sign up for something like hidemyass.com, they will probably provide you with an app that changes your proxy, although I don't know how they do it.
But, if they have an app that cycles you through various proxies, then all your internet traffic will go through that proxy - including your scraper. There's no need for the scraper to know about these proxies or how hide my ass works - it'll connect through the proxies just like your browser or FTP client or ....