I am trying to web-scrape the list of DAOs from messari.io, but I am having trouble because I get the following warning and errors:
DeprecationWarning: executable_path has been deprecated, please pass in a Service object
driver = webdriver.Chrome(options=options, executable_path=DRIVER_PATH)
DevTools listening on ws://127.0.0.1:56691/devtools/browser/b4609671-5e6e-4d25-b09e-4116b3dde4bf
[0525/100030.252:INFO:CONSOLE(1)] "enabling sentry error tracker", source: https://messari.io/static/js/main.977a4794.chunk.js (1)
[0525/100030.951:INFO:CONSOLE(2)] "Unable to refresh token: Login required", source: https://messari.io/static/js/23.778d04d0.chunk.js (2)
[0525/100031.065:INFO:CONSOLE(2)] "
88b d88 88
888b d888 ""
88'8b d8'88
88 '8b d8' 88 ,adPPYba, ,adPPYba, ,adPPYba, ,adPPYYba, 8b,dPPYba, 88
88 '8b d8' 88 a8P_____88 I8[ "" I8[ "" "" 'Y8 88P' "Y8 88
88 '8b d8' 88 8PP""""""" '"Y8ba, '"Y8ba, ,adPPPPP88 88 88
88 '888' 88 "8b, ,aa aa ]8I aa ]8I 88, ,88 88 88
88 '8' 88 '"Ybbd8"' '"YbbdP"' '"YbbdP"' '"8bbdP"Y8 88 88
", source: https://messari.io/static/js/23.778d04d0.chunk.js (2)
[0525/100031.069:INFO:CONSOLE(2)] "Interested in a CHALLENGE? Check out: https://messari.io/quiz", source: https://messari.io/static/js/23.778d04d0.chunk.js (2)
Traceback (most recent call last):
File "c:/Users/Student/webScrape/scraper.py", line 21, in <module>
matches = WebDriverWait(driver, 10).until(
File "C:\Users\Student\AppData\Local\Programs\Python\Python38-32\lib\site-packages\selenium\webdriver\support\wait.py", line 89, in until
raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:
Stacktrace:
Backtrace:
Ordinal0 [0x0096B8F3+2406643]
Ordinal0 [0x008FAF31+1945393]
Ordinal0 [0x007EC748+837448]
Ordinal0 [0x008192E0+1020640]
Ordinal0 [0x0081957B+1021307]
Ordinal0 [0x00846372+1205106]
Ordinal0 [0x008342C4+1131204]
Ordinal0 [0x00844682+1197698]
Ordinal0 [0x00834096+1130646]
Ordinal0 [0x0080E636+976438]
Ordinal0 [0x0080F546+980294]
GetHandleVerifier [0x00BD9612+2498066]
GetHandleVerifier [0x00BCC920+2445600]
GetHandleVerifier [0x00A04F2A+579370]
GetHandleVerifier [0x00A03D36+574774]
Ordinal0 [0x00901C0B+1973259]
Ordinal0 [0x00906688+1992328]
Ordinal0 [0x00906775+1992565]
Ordinal0 [0x0090F8D1+2029777]
BaseThreadInitThunk [0x777BFA29+25]
RtlGetAppContainerNamedObjectPath [0x77B77A7E+286]
RtlGetAppContainerNamedObjectPath [0x77B77A4E+238]
I know there is an API for messari.io, but I am almost certain it is only for their assets and not their list of DAOs. I tried using Selenium since it is a dynamic page but I am still having trouble. Here is my code:
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import requests
url = 'https://messari.io/governor/daos'
DRIVER_PATH = 'PATH_TO_DRIVER_ON_MY_PC'
options = Options()
options.headless = True
options.add_argument("--window-size=1920, 1200")
# s = Service('PATH_TO_DRIVER_ON_MY_PC')
driver = webdriver.Chrome(options=options, executable_path=DRIVER_PATH)
driver.get('https://messari.io/governor/daos')
try:
    matches = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.TAG_NAME, "td")))
    # for match in matches:
    #     print(match.text)
finally:
    driver.quit()
Update: I fixed the executable_path warning, but I am still getting the same TimeoutException. When I run it without headless mode, I also get the following message:
DevTools listening on ws://127.0.0.1:57773/devtools/browser/4450b78d-3a9f-401a-b39c-2c716ecad924
[9628:20616:0525/102300.840:ERROR:device_event_log_impl.cc(214)] [10:23:00.840] USB: usb_device_handle_win.cc:1049 Failed to read descriptor from node connection: A device attached to the system is not functioning. (0x1F)
[9628:20616:0525/102300.841:ERROR:device_event_log_impl.cc(214)] [10:23:00.841] USB: usb_device_handle_win.cc:1049 Failed to read descriptor from node connection: A device attached to the system is not functioning. (0x1F)
I assume this part is more of a hardware message that I shouldn't worry about, based on similar questions, because when I unplugged my mouse one of the two messages went away.
If the element is inside a <frame>, then you have to use driver.switch_to before you try to search it.
I don't see any <td> in the code. It uses only <div> to create something like a table. What do you really want to get?
The page doesn't use <td> to display it but <div>, and it keeps Fei in <h4> - at least in my Firefox on a desktop system.
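Following the comments above, here is a minimal sketch of one possible direction, not tested against the live page. It assumes the DAO names really are rendered in <h4> elements (as one comment reports for Fei), that the content is not inside a frame, and that the page markup has not changed since. The names H4TextParser, extract_h4_texts and fetch_dao_names are hypothetical helpers made up for this example. The driver is created with a Service object, which is what the DeprecationWarning asks for.

```python
from html.parser import HTMLParser


class H4TextParser(HTMLParser):
    """Collect the stripped text of every <h4> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._in_h4 = False
        self._buffer = []
        self.names = []

    def handle_starttag(self, tag, attrs):
        if tag == "h4":
            self._in_h4 = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_h4:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "h4" and self._in_h4:
            text = "".join(self._buffer).strip()
            if text:
                self.names.append(text)
            self._in_h4 = False


def extract_h4_texts(html):
    """Return the text of all <h4> elements found in an HTML string."""
    parser = H4TextParser()
    parser.feed(html)
    parser.close()
    return parser.names


def fetch_dao_names(url, driver_path, timeout=10):
    """Load the page in headless Chrome and return the <h4> texts.

    Assumption: the names are in <h4> tags, per the comment above.
    """
    # Selenium is imported here so the parsing helper works without it.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    options.add_argument("--headless")
    options.add_argument("--window-size=1920,1200")  # no space after the comma
    service = Service(executable_path=driver_path)  # replaces executable_path=
    driver = webdriver.Chrome(service=service, options=options)
    try:
        driver.get(url)
        # Wait for an <h4> instead of a <td>, since the page has no <td>.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.TAG_NAME, "h4")))
        return extract_h4_texts(driver.page_source)
    finally:
        driver.quit()
```

It would be called as fetch_dao_names('https://messari.io/governor/daos', DRIVER_PATH). If the wait still times out, the <h4> assumption is probably wrong for the current page, and inspecting driver.page_source before the wait should show which tags are actually present.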