
I need to extract the table with the columns "R.U.T" and "Entidad" from this page:

http://www.svs.cl/portal/principal/605/w3-propertyvalue-18554

I wrote the following code:

library(rvest)

# read the page
url <- "http://www.svs.cl/portal/principal/605/w3-propertyvalue-18554.html"
page <- read_html(url)

# extract the table by XPath
table <- html_node(page, xpath = '//*[@id="listado_fiscalizados"]/table')
table <- html_table(table)

# convert the table to a data.frame
table <- data.frame(table)

but R shows me the following result:

> a
{xml_nodeset (0)}

That is, it does not find the table. Could it be because the table contains hyperlinks?

If anyone knows how to extract the table, I would appreciate it. Many thanks in advance.

  • It looks like the table is loaded with JavaScript, so you'll need to grab the HTML via RSelenium or the like. Here's a recent example that you should be able to translate directly. Commented Jan 10, 2017 at 22:07
  • I knew about RSelenium, but I wanted to try a different kind of solution. Thank you very much for your answer; if I do not find another solution I will use RSelenium :) Commented Jan 11, 2017 at 1:39

2 Answers

The page makes an XHR request to another resource, and the response from that request is used to build the table. You can read that resource directly:

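library(rvest)
library(dplyr)

# request the resource that the page loads via XHR
pg <- read_html("http://www.svs.cl/institucional/mercados/consulta.php?mercado=S&Estado=VI&consulta=CSVID&_=1484105706447")

# parse every table, take the first one, and keep its first two columns
html_nodes(pg, "table") %>%
  html_table() %>%
  .[[1]] %>%
  as_tibble() %>%   # tbl_df() is deprecated in current dplyr
  select(1:2)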
## # A tibble: 36 × 2
##        R.U.T.                                            Entidad
##         <chr>                                              <chr>
## 1  99588060-1                           ACE SEGUROS DE VIDA S.A.
## 2  76511423-3                               ALEMANA SEGUROS S.A.
## 3  96917990-3                      BANCHILE SEGUROS DE VIDA S.A.
## 4  96933770-3                          BBVA SEGUROS DE VIDA S.A.
## 5  96573600-K                              BCI SEGUROS VIDA S.A.
## 6  96656410-5                 BICE VIDA COMPAÑIA DE SEGUROS S.A.
## 7  96837630-6            BNP PARIBAS CARDIF SEGUROS DE VIDA S.A.
## 8  76418751-2 BTG PACTUAL CHILE S.A. COMPAÑIA DE SEGUROS DE VIDA
## 9  76477116-8                            CF SEGUROS DE VIDA S.A.
## 10 99185000-7           CHILENA CONSOLIDADA SEGUROS DE VIDA S.A.
## # ... with 26 more rows

You can use Developer Tools in any modern browser to monitor the Network requests to find that URL.
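
The URL copied from the Network tab ends in a `_=1484105706447` parameter. That looks like a timestamp-style cache buster added by the page's JavaScript, but this is an assumption, and whether the server still answers without it has not been checked here. A minimal base R sketch for dropping that parameter before reusing the URL (the names `xhr_url`, `kept` and `clean_url` are only illustrative):

```r
# URL as copied from the browser's Network tab (taken from the answer above)
xhr_url <- "http://www.svs.cl/institucional/mercados/consulta.php?mercado=S&Estado=VI&consulta=CSVID&_=1484105706447"

# split into the base address and the query string
parts <- strsplit(xhr_url, "?", fixed = TRUE)[[1]]

# split the query string into its key=value pairs
params <- strsplit(parts[2], "&", fixed = TRUE)[[1]]

# drop the "_" parameter (assumed to be a cache buster, not a real filter)
kept <- params[!startsWith(params, "_=")]

# rebuild the URL from the remaining parameters
clean_url <- paste0(parts[1], "?", paste(kept, collapse = "&"))
print(clean_url)
```

If the shortened URL does not return the table, fall back to the full URL exactly as the browser sent it.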


2 Comments

This is the solution I was looking for. I changed the URL and XPath in my code and it works. Thank you very much. One question: how did you know the table came from another resource?
"You can use Developer Tools in any modern browser to monitor the Network requests to find that URL.". It's worth the effort to poke at browser "Inspect" / "Inspect Element" / "Developer Tools". Tons of good stuff under the covers of most web pages.

This is the answer using RSelenium:

library(RSelenium)
library(XML)

# Start Selenium Server
RSelenium::checkForServer(beta = TRUE)
selServ <- RSelenium::startServer(javaargs = c("-Dwebdriver.gecko.driver=\"C:/Users/Mislav/Documents/geckodriver.exe\""))
remDr <- remoteDriver(extraCapabilities = list(marionette = TRUE))
remDr$open() # silent = TRUE
Sys.sleep(2)

# Open the page in the browser and parse the rendered source
remDr$navigate("http://www.svs.cl/portal/principal/605/w3-propertyvalue-18554.html")
Sys.sleep(2)
doc <- htmlParse(remDr$getPageSource()[[1]], encoding = "UTF-8")

# close and stop server
remDr$close()
selServ$stop()

# read every table in the rendered page into a list of data frames
tables <- readHTMLTable(doc)
head(tables)
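
Once the rendered source is available as a string, the XPath from the question can also be applied with rvest instead of XML. The sketch below is offline: `page_source` is a hypothetical stand-in for `remDr$getPageSource()[[1]]`, its two rows are copied from the output in the other answer, and the `div`/`table` nesting is assumed from the question's XPath rather than taken from the real page.

```r
library(rvest)

# Hypothetical stand-in for remDr$getPageSource()[[1]]; the structure is an
# assumption based on the XPath in the question, not the real page markup.
page_source <- '<html><body><div id="listado_fiscalizados"><table>
  <tr><th>R.U.T.</th><th>Entidad</th></tr>
  <tr><td>99588060-1</td><td>ACE SEGUROS DE VIDA S.A.</td></tr>
  <tr><td>76511423-3</td><td>ALEMANA SEGUROS S.A.</td></tr>
</table></div></body></html>'

# parse the rendered HTML string
rendered <- read_html(page_source)

# select the table with the same XPath used in the question
node <- html_node(rendered, xpath = '//*[@id="listado_fiscalizados"]/table')

# convert the selected table node to a data frame
entities <- html_table(node)
print(entities)
```

The XPath failed in the question only because the table was not yet present in the HTML that `read_html()` downloaded; on the rendered source the same selector can match.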

2 Comments

You need to show what packages you're loading at the top; it looks like XML as well as RSelenium.
Thank you very much for your answer, this works :D. Anyway I will continue to see a solution without RSelenium.
