Sorry for the vague title; to better describe the problem, visit the following website:
There is text on the right that says "See all". Once you click it, a list of links to various forks pops up. I am trying to scrape the hyperlinks of those forks.
One problem is that the scraper picks up not only the links to the forks but also the links to the profiles. The page doesn't use a specific class or ID for the fork links, so I edited my script to work out which results are the right ones and which are not. That part works. However, the script only scrapes some of the links and skips others. This confused me, because at first I thought the elements weren't visible to Selenium since the list scrolls, but that doesn't seem to be the issue: links that are never scraped are plainly visible. The script scrapes only the first 5 links and skips the rest entirely.
I am now unsure what to do, since there is no error or warning pointing to any issue in the code itself.
This is the part of the code that scrapes the links:
driver.get(url)

# Open the "See all" forks dialog
wait.until(ec.presence_of_element_located((By.CSS_SELECTOR, "button.see-all-forks"))).click()

fork_count = wait.until(ec.presence_of_element_located((By.CSS_SELECTOR, "span.jsx-3602798114"))).text
forks = wait.until(ec.presence_of_all_elements_located((By.CSS_SELECTOR, "a.jsx-2470659356")))

j = 1
for i, fork in enumerate(forks):
    if j == 1:
        # This element is a fork link, not a profile link
        forks[i] = fork.get_attribute("href")
        print(forks[i])
    if j == 3:
        j = 1
    else:
        j += 1
Here, the "url" variable is the link I provided above. The loop then skips 3 results after each match because every 4th one is the right one. I tried filtering the results with XPath's "contains" function, but the link names vary because users name the forks themselves, so to my understanding counting positions is the only way to filter the results.
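As an aside, assuming the pattern really is "every 4th anchor starting with the first", the counter could be replaced with a simple list slice. A minimal sketch (the strings stand in for the real WebElements, on which you would call get_attribute("href")):

fork_anchors = ["fork1", "profile1a", "profile1b", "profile1c",
                "fork2", "profile2a", "profile2b", "profile2c"]

# Take every 4th element, starting with the first
fork_links = fork_anchors[::4]
print(fork_links)  # → ['fork1', 'fork2']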
This is the output that I get.
After that, no further results are printed and the program terminates without errors. What is happening here, and what have I missed? I am confused about why Selenium scrapes only five results and then terminates.
Edit note - my code explained:
I set up the if statements to catch every 4th result, since that's the right one; the first result is also a right one. If j != 3, 1 is added to j; once j == 3, the counter is reset, so on the next iteration j == 1 and the right result is printed. The right result will therefore always fall on j == 1.
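For reference, the counter logic can be simulated on plain indices (a standalone sketch, no Selenium needed) to see exactly which positions it selects. Note that as written it fires on every 3rd index (0, 3, 6, ...), not every 4th; if the fork links really sit at every 4th position, the reset would need to happen at j == 4 rather than j == 3:

selected = []
j = 1
for i in range(12):  # stand-in for enumerate(forks)
    if j == 1:
        selected.append(i)  # stand-in for printing the href
    if j == 3:
        j = 1
    else:
        j += 1
print(selected)  # → [0, 3, 6, 9]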