I am trying to scrape dynamically populated web pages like this one in R.

I am trying to do that with RSelenium, but I am open to alternatives. For example, I would be happy to do everything with rvest alone.

The issue with RSelenium is that it does not start at all (even when I try Chrome). This is the output I get right after loading the package:

> rD <- rsDriver(browser = "firefox", port = 4545L, geckover = "latest")
checking Selenium Server versions:
BEGIN: PREDOWNLOAD
BEGIN: DOWNLOAD
BEGIN: POSTDOWNLOAD
checking chromedriver versions:
BEGIN: PREDOWNLOAD
BEGIN: DOWNLOAD
BEGIN: POSTDOWNLOAD
checking geckodriver versions:
BEGIN: PREDOWNLOAD
BEGIN: DOWNLOAD
BEGIN: POSTDOWNLOAD
checking phantomjs versions:
BEGIN: PREDOWNLOAD
BEGIN: DOWNLOAD
BEGIN: POSTDOWNLOAD
[1] "Connecting to remote server"
Could not open firefox browser.
Client error message:
Undefined error in httr call. httr output: Failed to connect to localhost port 4545 after 2259 ms: Couldn't connect to server
Check server log for further details.
Warning message:
In rsDriver(browser = "firefox", port = 4545L, geckover = "latest") :
  Could not determine server status.

I have seen a similar issue in a question on another forum, but the only solution in that case seemed to be simply specifying the port.

With Chrome, the problem appears to be that Chrome is now at version 130, while ChromeDriver only supports versions up to 113, if I understand correctly.

  • For rvest you might want to check previous Q&As targeting rvest::read_html_live() - stackoverflow.com/… Commented Oct 28, 2024 at 13:41
  • btw, in that specific case I don't think you actually need to scrape: the download option lets you configure a file format and points you to something like https://live.euronext.com/en/pd_es/data/stocks/download?mics=dm_all_stock&initialLetter=&fe_type=csv&fe_decimal_separator=.&fe_date_format=d%2Fm%2FY for CSV export. Commented Oct 28, 2024 at 16:42
  • @margusl yes, thanks. I have actually seen that I can download the list. I am using that page only as an example because it is the weirdest one I have seen: even after saving the page locally, the content I see in the browser does not appear. The page I am actually interested in is it.finance.yahoo.com/screener/new. Commented Oct 28, 2024 at 16:56
  • Moreover, the screener I'd like to access can only be viewed after logging in. Commented Oct 28, 2024 at 17:26
  • One example with read_html_live & login: stackoverflow.com/q/78948903/646761 . read_html_live() is basically an interface for chromote, so you might find this - github.com/rstudio/chromote/… - useful. Commented Oct 29, 2024 at 12:50
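
For reference, the rvest::read_html_live() route suggested in the comments could look roughly like the sketch below. It is not a tested solution for the pages mentioned here. It assumes rvest 1.0.4 or later, the chromote package, and a local Chrome or Chromium install. The function name scrape_live_tables, the default "table" selector, the fixed wait time, and the example URL are all hypothetical placeholders.

```r
# Sketch only: assumes rvest >= 1.0.4, the chromote package, and a local
# Chrome/Chromium install. The function name, default selector, and wait
# time are illustrative, not taken from the question.
scrape_live_tables <- function(url, css = "table", wait = 2) {
  if (!requireNamespace("rvest", quietly = TRUE)) {
    stop("Please install rvest (>= 1.0.4) first.")
  }
  # Load the page in headless Chrome so that its JavaScript actually runs.
  page <- rvest::read_html_live(url)
  # Crude fixed pause to let dynamic content load; tune as needed.
  Sys.sleep(wait)
  # Query the rendered DOM, then parse each matched <table> into a data frame.
  nodes <- rvest::html_elements(page, css)
  rvest::html_table(nodes)
}

if (interactive()) {
  # Placeholder URL: replace it with the page you want to scrape.
  tables <- scrape_live_tables("https://example.com/dynamic-page")
  str(tables)
}
```

Because the page is rendered by a real browser through chromote, this avoids the Selenium server and driver version matching entirely. For a page behind a login, the live page object also offers methods such as $type() and $click() for filling in and submitting a form, although the selectors to use depend on the site.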
