
I can't figure out where the error is in this code. Basically, it types a VAT code into the search bar, clicks the search button, and extracts the results:

from seleniumwire import webdriver
import time

API_KEY = 'my_api_key'

proxy_options = {
    'proxy': {
        'https': f'http://scraperapi:{API_KEY}@proxy-server.scraperapi.com:8001',
        'no_proxy': 'localhost,127.0.0.1'
    }
}

url = 'https://www.ufficiocamerale.it/'

vats = ['06655971007', '05779661007', '08526440154']

for vat in vats:

    driver = webdriver.Chrome(seleniumwire_options=proxy_options)
    driver.get(url)

    time.sleep(5)

    item = driver.find_element_by_xpath('//form[@id="formRicercaAzienda"]//input[@id="search_input"]')
    item.send_keys(vat)

    time.sleep(1)

    button = driver.find_element_by_xpath('//form[@id="formRicercaAzienda"]//p//button[@type="submit"]')
    button.click()

    time.sleep(5)

    all_items = driver.find_elements_by_xpath('//ul[@id="first-group"]/li')
    for item in all_items:
        if '@' in item.text:
            print(item.text.split(' ')[1])

driver.close()

When I run the script (chromedriver.exe is saved in the same folder, and I'm working in a Jupyter Notebook, if that matters) I get

NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//form[@id="formRicercaAzienda"]//input[@id="search_input"]"}

but this element does exist: running the script without ScraperAPI raises no errors. Can anyone figure out what the problem is?

1 Answer
  1. You are running a loop over 3 VAT values.
    After the first click on the search button, the results page is presented.
    There is no search input field or search button there!
    So, in order to perform a new search, you need to go back to the previous page after collecting the data from the results page.
  2. There is no need to create a new web driver instance on each iteration.
  3. You should also use Expected Conditions explicit waits instead of hardcoded pauses.
    This should work better:
from seleniumwire import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time

API_KEY = 'my_api_key'

proxy_options = {
    'proxy': {
        'https': f'http://scraperapi:{API_KEY}@proxy-server.scraperapi.com:8001',
        'no_proxy': 'localhost,127.0.0.1'
    }
}

url = 'https://www.ufficiocamerale.it/'

vats = ['06655971007', '05779661007', '08526440154']



driver = webdriver.Chrome(seleniumwire_options=proxy_options)
wait = WebDriverWait(driver, 20)
driver.get(url)


for vat in vats:
    input_search = wait.until(EC.visibility_of_element_located((By.XPATH, '//form[@id="formRicercaAzienda"]//input[@id="search_input"]')))
    input_search.clear()
    input_search.send_keys(vat)

    time.sleep(0.5)

    wait.until(EC.visibility_of_element_located((By.XPATH, '//form[@id="formRicercaAzienda"]//p//button[@type="submit"]'))).click()

    wait.until(EC.visibility_of_element_located((By.XPATH, '//ul[@id="first-group"]/li')))

    time.sleep(0.5)

    all_items = driver.find_elements(By.XPATH, '//ul[@id="first-group"]/li')
    for item in all_items:
        if '@' in item.text:
            print(item.text.split(' ')[1])
    driver.execute_script("window.history.go(-1)")

driver.close()
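As a side note, the email-extraction step inside the loop can be factored into a plain function and sanity-checked without a browser. This is just a sketch: the `extract_emails` name and the sample strings are made up for illustration, shaped like the page's `<li>` texts.

```python
def extract_emails(lines):
    """Return the second whitespace-separated token of every line that
    contains an '@' -- mirrors the item.text.split(' ')[1] logic above."""
    results = []
    for text in lines:
        if '@' in text:
            parts = text.split(' ')
            if len(parts) > 1:  # guard against a line with no label before the address
                results.append(parts[1])
    return results

# Made-up list-item texts for illustration
sample = [
    'PEC: pec@example.pec.it',
    'Telefono: 0612345678',
]
print(extract_emails(sample))  # ['pec@example.pec.it']
```

Keeping the parsing separate from the Selenium calls makes it easy to test the extraction logic on its own.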

UPD
This code works; the output is

[email protected]
[email protected]
[email protected]


21 Comments

It doesn't work. I get TimeoutException on the first line inside the for loop.
Right, I just tried that. We cannot click the submit button instantly after inserting the search text. Also, the search input should be cleared before entering a new value. Please see the updated code; I tested it and it works correctly.
I again get a TimeoutException, on the line wait.until(EC.visibility_of_element_located((By.XPATH, '//ul[@id="first-group"]/li'))). If the code we run is the same, what could the difference be?
Does the submit button get clicked and the result page displayed after that when you run it? On my side the answer is yes.
I don't think that will help. It feels like there is some JavaScript on the page, triggered by clicking elements, so you cannot immediately click another element until some process on the page completes. That's why short delays are needed there. But since this happens inside the page and has nothing to do with server-side responses, increasing the delays will not help here.
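The remaining hardcoded half-second pauses could also be replaced by a generic poll-and-retry helper, which is essentially what WebDriverWait's until() does under the hood. This is a sketch with a made-up `wait_for` name; the timings are illustrative.

```python
import time

def wait_for(condition, timeout=10.0, poll=0.25):
    """Poll `condition` until it returns a truthy value or `timeout`
    seconds elapse; roughly mirrors WebDriverWait.until() behaviour."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() > deadline:
            raise TimeoutError('condition not met within %.1fs' % timeout)
        time.sleep(poll)

# Usage sketch (hypothetical): wait until the result list is populated
# items = wait_for(lambda: driver.find_elements(By.XPATH, '//ul[@id="first-group"]/li'))
```

Since `find_elements` returns an empty (falsy) list when nothing matches, a lambda around it works directly as the condition.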
