I have a text file of URLs named all_urls.txt, with one URL per line. I want to pass this list to Selenium (Python) to extract specific data. I can do this by handling the URLs one at a time, but that is not efficient. My code currently looks like this:

profile = FirefoxProfile('/home/test/.mozilla/firefox/mfgrtrtr.Default3')
browser = webdriver.Firefox(firefox_profile=profile)
browser.maximize_window()
# get website
browser.get('https://www.some-website.com/')
# get current url
print browser.current_url
# get name & get phone number
name = browser.find_element_by_class_name("name")
print name.text
phone = browser.find_element_by_class_name("phone")
print phone.text

How can I pass the list to browser.get and extract the name and phone number from each URL? Thanks in advance for your help. I am new to Python but enjoying the challenge.

  • Do you know how to open a file and use a for loop? with open(yourfile) as f: for url in map(str.rstrip, f): ... Commented Mar 23, 2016 at 11:03

2 Answers


You need a for loop that iterates over the lines of the file. Your code should look something like this:

from selenium import webdriver
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile

profile = FirefoxProfile('/home/test/.mozilla/firefox/mfgrtrtr.Default3')
browser = webdriver.Firefox(firefox_profile=profile)
browser.maximize_window()
# all_urls.txt holds one URL per line
with open("all_urls.txt") as in_file:
    for url in in_file:
        # get website (strip removes the trailing newline)
        browser.get(url.strip())
        # get current url
        print(browser.current_url)
        # get name & get phone number
        name = browser.find_element_by_class_name("name")
        print(name.text)
        phone = browser.find_element_by_class_name("phone")
        print(phone.text)

The .strip() call on the URL ensures that it has no leading or trailing whitespace. Lines read from a file normally include the trailing newline character, which would otherwise be passed to browser.get as part of the URL.
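
To see the effect of stripping without starting a browser, here is a minimal sketch that reads a URL file into a list. The helper name read_urls and the second sample URL are made up for illustration, and the temporary file only stands in for all_urls.txt. Skipping blank lines is an extra precaution, not something the answer above does.

```python
import os
import tempfile


def read_urls(path):
    """Return the non-empty lines of a text file, whitespace removed."""
    with open(path) as in_file:
        # strip() drops the trailing newline; the filter skips blank lines
        return [line.strip() for line in in_file if line.strip()]


# Throwaway file standing in for all_urls.txt (second URL is hypothetical)
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("https://www.some-website.com/\n")
    tmp.write("\n")
    tmp.write("  https://www.some-website.com/page2  \n")

urls = read_urls(tmp.name)
os.remove(tmp.name)
print(urls)
```

Each entry of the resulting list can then be passed straight to browser.get.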


Open the file:

my_file = open("all_urls.txt", "r")

Iterate through it and call the get function on each URL:

for url in my_file:
    # strip the trailing newline before loading the page
    browser.get(url.strip())
    print ...
    print ...
my_file.close()
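
A loop like this stops at the first URL that fails to load or that has no "name" or "phone" element. As a rough sketch only, the function below collects the results and skips pages that raise an error. The name scrape_all is hypothetical, browser is assumed to be the webdriver.Firefox instance from the question, and find_element_by_class_name is used as in the question (newer Selenium releases use find_element(By.CLASS_NAME, ...) instead).

```python
def scrape_all(browser, urls):
    """Visit each URL and collect the text of its "name" and "phone" elements."""
    results = []
    for url in urls:
        try:
            browser.get(url)
            name = browser.find_element_by_class_name("name").text
            phone = browser.find_element_by_class_name("phone").text
        except Exception as exc:
            # Broad catch to keep the sketch short; in real code, narrow this
            # to the Selenium exceptions you expect.
            print("skipping %s: %s" % (url, exc))
            continue
        results.append({"url": url, "name": name, "phone": phone})
    return results
```

It could be called as scrape_all(browser, urls), where urls is a list of stripped lines from all_urls.txt.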
