I'm trying to loop through a job listing website to collect its job listings and run text analysis on them. For this I use RSelenium. The code I am working on is as follows:
#### REMOTE.COM ####
remDR$navigate('https://remote.com/jobs/all?query=marketing&country=anywhere')
# click on the cookies policy
remDR$findElement(using = 'xpath', '//*[@id="ccc-notify-accept"]')$clickElement()
# print all job listings
num_links <- 20
for(i in 1:num_links){
    remDR$findElement(using = 'xpath',
                      paste('/html/body/div[2]/main/div/div/div[3]/article[',i,']', sep = ''))$clickElement()
    print(remDR$getCurrentUrl())
    remDR$goBack()
}
When I start the loop, two problems occur.
First, print(remDR$getCurrentUrl()) returns the original URL (https://remote.com/jobs/all?query=marketing&country=anywhere), not the URL of the page that was just clicked in the loop. Second, when remDR$goBack() executes, it takes me back to the previous blank page, as if no link had been clicked.
In short, I think the loop runs faster than RSelenium can find and click the element.
EDIT
Solution was found thanks to a recommendation:
for(i in 1:5){
    remDR$findElement(using = 'xpath',
                      paste('/html/body/div[2]/main/div/div/div[3]/article[',i,']', sep = ''))$clickElement()
    Sys.sleep(2) # give the job page time to load
    print(remDR$getCurrentUrl())
    # $navigate() works better than goBack() here: it reloads the listing page
    remDR$navigate('https://remote.com/jobs/all?query=marketing&country=anywhere')
    Sys.sleep(2) # give the listing page time to load
}
The fix was to give Chrome time to load each page with Sys.sleep(2) and to use $navigate() instead of goBack(), because $navigate() loads the listing page in the browser again. Important note: the loop won't work without the final Sys.sleep(2), because the listing page has to finish loading before the loop clicks on the next item.
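A fixed Sys.sleep(2) can be too short on a slow connection and wastes time on a fast one. As a rough sketch of an alternative, one could poll until the page has actually changed. The helper names wait_until() and get_job_url() below are hypothetical, not part of RSelenium, and the sketch assumes remDR is an open remoteDriver session and that the listing URL stays exactly as typed after loading (no redirect):

```r
# Poll `condition` (a zero-argument function) until it returns TRUE or
# `timeout` seconds have passed. Returns TRUE on success, FALSE on timeout.
# Errors raised by `condition` are treated as "not ready yet".
wait_until <- function(condition, timeout = 10, interval = 0.25) {
  deadline <- Sys.time() + timeout
  repeat {
    ok <- tryCatch(isTRUE(condition()), error = function(e) FALSE)
    if (ok) return(TRUE)
    if (Sys.time() >= deadline) return(FALSE)
    Sys.sleep(interval)
  }
}

# Hypothetical helper: click the i-th listing, wait for the URL to change,
# return the job URL (NA on timeout), then reload the listing page.
get_job_url <- function(remDR, i, list_url, timeout = 10) {
  xpath <- paste0('/html/body/div[2]/main/div/div/div[3]/article[', i, ']')
  remDR$findElement(using = 'xpath', xpath)$clickElement()

  # getCurrentUrl() returns a list of length one, hence [[1]]
  changed <- wait_until(function() remDR$getCurrentUrl()[[1]] != list_url,
                        timeout)
  job_url <- if (changed) remDR$getCurrentUrl()[[1]] else NA_character_

  remDR$navigate(list_url)
  # Wait until the clicked article is present again on the listing page
  wait_until(function() length(remDR$findElements(using = 'xpath', xpath)) > 0,
             timeout)
  job_url
}
```

With a live session, something like get_job_url(remDR, 1, 'https://remote.com/jobs/all?query=marketing&country=anywhere') would replace the body of the loop. The trade-off is a little more code in exchange for waits that adapt to the actual load time.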
print() with rbind() to this dataframe. For delay use Sys.sleep(5) or whatever value it will be. Instead of goBack() script it to follow small right arrow on the bottom. Or add a number to &page=2 URL part until there is a result.
goBack() alternative. Can you expand more on that point?
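To illustrate the suggestion about storing results with rbind() instead of printing them, and about stepping through a page number in the URL, here is a minimal sketch. The function names page_url() and collect_rows() are hypothetical, and whether the site actually accepts a page query parameter is an assumption taken from the comment, not something verified:

```r
# Build the URL of a given results page.
# Assumption: the site accepts a `page` query parameter, as the comment suggests.
page_url <- function(base_url, page) {
  paste0(base_url, '&page=', page)
}

# Turn a character vector of job URLs into a data frame with one row per job.
# Rows are built in a list and bound once at the end, which avoids growing
# a data frame with rbind() on every loop iteration.
# Returns NULL when `urls` is empty.
collect_rows <- function(urls) {
  rows <- lapply(seq_along(urls), function(i) {
    data.frame(index = i, url = urls[[i]], stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}
```

In the loop from the question, each URL returned by remDR$getCurrentUrl()[[1]] could be appended to a character vector and passed to collect_rows() once the loop finishes.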