Hello, I'm new to the fascinating world of R. I have not been able to skip URLs that do not exist. How can I handle them so that they are not raised as an error? Thanks for your help.


---
title: "error"
author: "FJSG"
date: "27/6/2020"
output: html_document
---

knitr::opts_chunk$set(echo = TRUE)

library(xml2)
library(rvest)
library(tidyverse)
library(lubridate)

zora_core <- read_html("https://zora.medium.com/the-zora-music-canon-5a29296c6112")

Los_100 <- data.frame(album      = html_nodes(zora_core, "h1:not(#96c9)") %>% 
                                     html_text() %>% 
                                     str_trim(side = "both"),
                      interprete = html_nodes(zora_core, "strong em , p#73e0 strong") %>% 
                                     html_text() %>% 
                                     str_remove_all("^by") %>%
                                     str_extract("[a-zA-Z].+(?=[(])") %>% str_trim(side = "both"),
                      año        = html_nodes(zora_core, "strong em , p#73e0 strong") %>% 
                                     html_text() %>% 
                                     str_extract("([[:digit:]]){4}"),
                      liga       = paste0("https://en.wikipedia.org/wiki/",
                                          html_nodes(zora_core, "strong em , p#73e0 strong") %>% 
                                            html_text() %>%
                                            str_remove_all("^by") %>%
                                            str_extract("[a-zA-Z].+(?=[(])") %>%
                                            str_trim(side = "both") %>%
                                            str_replace_all(" ", "_")))

carga <- function(url){
  
         perfil_raw <- read_html(url)
         data.frame(interprete = html_node(perfil_raw, "h1#firstHeading") %>% 
                                 html_text() %>% str_trim(side = "both"))
         
}
lista <- Los_100$liga[1:16] # The URL at position 16 does not exist; how can I avoid the error?

datos_personales <- map_df(lista,carga)


1 Answer

It's useful to learn about error handling in R, but when working with HTTP requests it becomes essential.

In your case, it is best to wrap carga in a tryCatch. tryCatch evaluates the expression you pass as its first argument. If that expression throws an error, the error is caught and passed to the handler function you supply as the error argument, and the handler's return value becomes the result of tryCatch.
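To see the mechanism in isolation from any network call, here is a minimal sketch. The function raiz_segura is a hypothetical stand-in, not part of your code; it fails on negative input in the same way read_html fails on a missing page.

# Hypothetical helper: fails on negative input, standing in for a failing request.
raiz_segura <- function(x) {
  tryCatch(
    {
      if (x < 0) stop("negative input")
      sqrt(x)
    },
    error = function(e) {
      # e is the condition object; conditionMessage(e) extracts its text.
      message("Caught: ", conditionMessage(e))
      NA_real_  # fallback value returned in place of the error
    }
  )
}

raiz_segura(9)
#> [1] 3
raiz_segura(-1)
#> [1] NA

The second call does not stop the script: the handler runs, prints its message, and its return value takes the place of the failed expression.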

If an error is thrown we need to return a data frame with a single column called interprete so that map_df can bind it together with the other results:

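carga_catch <- function(x) {
  # Try the normal scrape; on any error, fall back to a placeholder row
  # with the same column name so map_df can still bind the results.
  tryCatch(carga(x),
           error = function(e) data.frame(interprete = "**inexistente**"))
}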

map_df(lista, carga_catch)
#>               interprete
#> 1        Ella Fitzgerald
#> 2          Sarah Vaughan
#> 3         Billie Holiday
#> 4  Sister Rosetta Tharpe
#> 5             Lena Horne
#> 6        Mahalia Jackson
#> 7          Abbey Lincoln
#> 8             Etta James
#> 9         Leontyne Price
#> 10       Marian Anderson
#> 11      Dinah Washington
#> 12                Odetta
#> 13        Dionne Warwick
#> 14          The Supremes
#> 15           Nina Simone
#> 16       **inexistente**
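As an alternative, purrr (already loaded through tidyverse) provides possibly(), which builds the same kind of wrapper without writing tryCatch by hand. The sketch below uses a hypothetical carga_demo function and example.org URLs so that it runs without a network connection; in your code you would wrap carga itself.

library(purrr)

# Hypothetical stand-in for carga(): fails on one input, the way
# read_html() fails on a page that does not exist.
carga_demo <- function(url) {
  if (grepl("missing", url)) stop("HTTP error 404")
  data.frame(interprete = basename(url), stringsAsFactors = FALSE)
}

# possibly() returns a wrapped function that yields `otherwise`
# instead of throwing an error.
carga_segura <- possibly(
  carga_demo,
  otherwise = data.frame(interprete = NA_character_, stringsAsFactors = FALSE)
)

urls <- c("https://example.org/wiki/Odetta", "https://example.org/wiki/missing")
resultado <- map_df(urls, carga_segura)

Using NA as the fallback instead of a placeholder string is a design choice: missing rows can then be dropped or counted with the usual NA tools, such as is.na().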

Apart from error handling, I think your code is very good for someone just beginning in R. It achieves a lot in a few lines of code and is perfectly readable. Good work!

1 Comment

Many thanks. It looks easy, but I really didn't know how to do it. Your comments helped me, thanks.
