I'm trying to scrape a website and I'm new to regular expressions. I have a long character vector, and this is the line I'm targeting:
<h3 class=\"title4\">Results: <span id=\"hitCount.top\">10,079</span></h3>\n
I want to extract the number between <span id=\"hitCount.top\"> and </span>, in this case 10,079. My approach so far is not really working:
x <- '<h3 class=\"title4\">Results: <span id=\"hitCount.top\">10,079</span>'
m <- gregexpr(pattern="[<span id=\"hitCount.top\">].+[</span>]", x, ignore.case = FALSE, perl = FALSE,
fixed = FALSE, useBytes = FALSE)
regmatches(x, m)
Any help will be appreciated.
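To return only the number and not the whole string, the pattern also has to consume the text before the span, so add a leading .* (perl = TRUE is used here so that the lazy .*? is handled by PCRE):
sub(".*<span id=\"hitCount.top\">(.*?)<.*", "\\1", x, perl = TRUE)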
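As for why the attempt in the question fails: square brackets define a character class, so `[<span id=\"hitCount.top\">]` matches one single character out of that set rather than the literal tag. If you would rather stay with `regmatches`, a rough sketch using lookarounds could look like the following. It assumes the count only ever contains digits and commas; the names `pattern`, `hit` and `hit_count` are just illustrative.

```r
# Sample input copied from the question.
x <- '<h3 class="title4">Results: <span id="hitCount.top">10,079</span>'

# Lookbehind and lookahead (these need perl = TRUE) so that only the
# number itself is part of the match, not the surrounding tags.
pattern <- '(?<=<span id="hitCount\\.top">)[0-9,]+(?=</span>)'
hit <- regmatches(x, regexpr(pattern, x, perl = TRUE))

# Drop the thousands separator before converting to a number.
hit_count <- as.numeric(gsub(",", "", hit, fixed = TRUE))
```

Note that `regmatches` combined with `regexpr` silently drops elements that have no match, so on the full character vector the result can be shorter than the input.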