I've been building this scraper (with a lot of help from users here) to collect data on companies' debts to the public sector. So far I can reach the site, enter the desired search parameters, and scrape the first 50 results (out of 300). The problem is that this page's pagination has the following characteristics:
- There is no "next page" button
- The URL doesn't change when the page changes
- The pagination is handled by a JavaScript script
Here's the code so far:
from selenium import webdriver

path_driver = "C:/Users/CS330584/Documents/Documentos de Defesa da Concorrência/Automatização de Processos/chromedriver.exe"
website = "https://sat.sef.sc.gov.br/tax.NET/Sat.Dva.Web/ConsultaPublicaDevedores.aspx"
value_search = "300"
final_table = []

driver = webdriver.Chrome(path_driver)
driver.get(website)

# Set the maximum number of debtors to return and run the search
search_max = driver.find_element_by_id("Body_Main_Main_ctl00_txtTotalDevedores")
search_max.send_keys(value_search)
btn_consult = driver.find_element_by_id("Body_Main_Main_ctl00_btnBuscar")
btn_consult.click()
driver.implicitly_wait(10)

# One list of cells per column of the results grid
cnpjs = driver.find_elements_by_xpath("//*[@id='Body_Main_Main_grpDevedores_gridView']/tbody/tr/td[1]")
empresas = driver.find_elements_by_xpath("//*[@id='Body_Main_Main_grpDevedores_gridView']/tbody/tr/td[2]")
dividas = driver.find_elements_by_xpath("//*[@id='Body_Main_Main_grpDevedores_gridView']/tbody/tr/td[3]")

for i in range(len(empresas)):
    temp_data = {'CNPJ': cnpjs[i].text,
                 'Empresas': empresas[i].text,
                 'Divida': dividas[i].text}
    final_table.append(temp_data)
How can I navigate through the pages in order to scrape their data? Thanks in advance for any help!
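One idea I'm considering, in case it helps frame the question: since this is an ASP.NET (.aspx) page, the pager is probably a row of numbered links that fire a JavaScript postback and replace the table in place. If so, each page number could be clicked by its link text, waiting for the old grid element to go stale before re-scraping. The sketch below assumes exactly that; the link-text locator, the grid id reuse, and the page count (6 pages of 50) are guesses about this particular page, not verified behaviour.

```python
def rows_to_records(cnpjs, empresas, dividas):
    """Zip the three column text lists into one dict per table row."""
    return [{'CNPJ': c, 'Empresas': e, 'Divida': d}
            for c, e, d in zip(cnpjs, empresas, dividas)]

def scrape_all_pages(driver, n_pages=6):
    """Click through numbered pager links and collect every row.

    Assumes the pager entries are <a> elements whose visible text is the
    page number, and that clicking one triggers a postback that swaps the
    results grid in place (untested assumptions about this page).
    """
    # Imports kept local so rows_to_records above stays usable
    # even where Selenium is not installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    grid_id = "Body_Main_Main_grpDevedores_gridView"  # same id as in my code
    records = []
    for page in range(1, n_pages + 1):
        if page > 1:
            old_grid = driver.find_element(By.ID, grid_id)
            link = WebDriverWait(driver, 10).until(
                EC.element_to_be_clickable((By.LINK_TEXT, str(page))))
            link.click()
            # The postback replaces the grid, so the old element goes stale;
            # waiting on staleness avoids scraping the previous page again.
            WebDriverWait(driver, 10).until(EC.staleness_of(old_grid))
        cols = []
        for td in (1, 2, 3):
            cols.append([cell.text for cell in driver.find_elements(
                By.XPATH, f"//*[@id='{grid_id}']/tbody/tr/td[{td}]")])
        records.extend(rows_to_records(*cols))
    return records
```

Is something along these lines a reasonable approach, or is there a better way to drive this kind of JavaScript pagination?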