I have the following HTML page and want to get all the links inside a specific div. Here is my HTML:

<div class="rec_view">
    <a href='www.xyz.com/firstlink.html'>
        <img src='imga.png'>
    </a>
    <a href='www.xyz.com/seclink.html'>
        <img src='imgb.png'>
    </a>
    <a href='www.xyz.com/thrdlink.html'>
        <img src='imgc.png'>
    </a>
</div>

I want to get every link inside the rec_view div. The links I want are:

www.xyz.com/firstlink.html
www.xyz.com/seclink.html
www.xyz.com/thrdlink.html

Here is the Python code I tried:

from selenium import webdriver
webpage = r"https://www.testurl.com/page/123/"
driver = webdriver.Chrome(r"C:\chromedriver_win32\chromedriver.exe")
driver.get(webpage)
element = driver.find_element_by_css_selector("div[class='rec_view']>a")
link = element.get_attribute("href")
print(link)

How can I get those links using Selenium in Python?

  • Are you sure your code does not work? If it doesn't, please tell us what you get when you run it. Commented Apr 30, 2018 at 8:21

1 Answer

Based on the HTML you shared, you can get the list of all the links inside the rec_view div with the following code block:

from selenium import webdriver

driver = webdriver.Chrome(executable_path=r'C:\chromedriver_win32\chromedriver.exe')
driver.get('https://www.testurl.com/page/123/')
elements = driver.find_elements_by_css_selector("div.rec_view a")
for element in elements:
    print(element.get_attribute("href"))

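Note: Because you need the href of every <a> in the div, use find_elements_* instead of find_element_*; the singular form returns only the first matching element. Additionally, > selects only <a> elements that are immediate children of the div, whereas the descendant selector div.rec_view a matches <a> elements at any depth inside it. For the HTML shown, both selectors match the same three links, so the switch to find_elements_* is the essential change.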
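
For readers on Selenium 4: the find_elements_by_* helpers used above were removed in Selenium 4.3, and recent 4.x releases no longer accept executable_path. The following is a minimal sketch of the same idea under those assumptions, not a definitive implementation. The names collect_hrefs and main are hypothetical, and the sketch assumes Selenium 4.6 or later, where Selenium Manager is expected to locate chromedriver on its own.

```python
from typing import List


def collect_hrefs(driver, selector: str = "div.rec_view a") -> List[str]:
    """Return the href of every element matching `selector`.

    "css selector" is the string value of By.CSS_SELECTOR
    (selenium.webdriver.common.by), so this is the same call as
    driver.find_elements(By.CSS_SELECTOR, selector).
    """
    # find_elements (plural) returns a list of all matches;
    # find_element (singular) would return only the first one.
    elements = driver.find_elements("css selector", selector)
    return [element.get_attribute("href") for element in elements]


def main() -> None:
    # Imported here so collect_hrefs can be exercised without a browser.
    from selenium import webdriver

    # Assumption: Selenium 4.6+ finds a matching chromedriver automatically.
    driver = webdriver.Chrome()
    try:
        driver.get("https://www.testurl.com/page/123/")
        for href in collect_hrefs(driver):
            print(href)
    finally:
        # Always close the browser, even if the lookup fails.
        driver.quit()
```

Calling main() opens Chrome, loads the page from the question, and prints one href per line. Keeping the lookup in a small helper also makes it easy to try a different selector, for example collect_hrefs(driver, "div.rec_view > a").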

1 Comment

I have a similar problem. I want to extract every href that corresponds to "tif" from the links and save them in a list. I tried by_partial_linked_text("goes-16.rgb.tif") and by_xpath(), but in both cases the list is empty. Using by_css_selector("marco_goes.div.div.modal-body.div.div.div.ol.li a")