I would like to parse the table "Table 1: Consumer Price Index, historical indices from 1924 (2015=100)" from here: https://www.ssb.no/en/priser-og-prisindekser/konsumpriser/statistikk/konsumprisindeksen
I am using Selenium to open the table that I want to parse (see code below), but the line calling pd.read_html raises this error:
ImportError: html5lib not found, please install it
This happens even though html5lib is installed (pip list shows version 1.1). What is the best way to parse the table?
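import pandas as pd
from time import sleep
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

options = Options()
url = "https://www.ssb.no/en/priser-og-prisindekser/konsumpriser/statistikk/konsumprisindeksen"
# mypath: local path to the chromedriver executable (defined elsewhere)
driver_no = webdriver.Chrome(options=options, executable_path=mypath)
driver_no.get(url)
sleep(2)
# Wait for the button that expands Table 1, then scroll to it and click it
WebDriverWait(driver_no, 20).until(EC.element_to_be_clickable((By.XPATH, '//*[@id="attachment-table-figure-1"]/button')))
elem = driver_no.find_element(By.XPATH, '//*[@id="attachment-table-figure-1"]/button')
sleep(2)
driver_no.execute_script("arguments[0].scrollIntoView(true);", elem)
sleep(2)
driver_no.find_element(By.XPATH, '//*[@id="attachment-table-figure-1"]/button').click()
# This is the line that raises the ImportError
df_list = pd.read_html(driver_no.page_source, "html_parser")
driver_no.quit()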
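
One possible cause, which is only an assumption and not confirmed for this setup, is that pip installed html5lib into a different Python environment than the one running the script. A minimal sketch to check this is shown below. The helper name module_available is hypothetical. It reports which interpreter is running and whether html5lib, bs4 and lxml (the parser packages pd.read_html can use) are importable from it.

```python
import importlib.util
import sys


def module_available(name: str) -> bool:
    """Return True if the top-level module `name` is importable by this interpreter."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # The interpreter that actually runs the script; compare it with
    # the Python reported by `pip --version`.
    print("Interpreter:", sys.executable)
    for name in ("html5lib", "bs4", "lxml"):
        status = "found" if module_available(name) else "missing"
        print(name, "->", status)
```

If html5lib shows up as missing here while pip list reports it, installing it with the same interpreter (python -m pip install html5lib) may help.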
