How can I write an XPath that returns the href of every product anchor on this page: https://www.amazon.com/s/ref=lp_11444071011_nr_p_8_1/132-3636705-4291947?rh=n%3A3375251%2Cn%3A%213375301%2Cn%3A10971181011%2Cn%3A11444071011%2Cp_8%3A2229059011 ? I want the href of the links that look like the one below, that is, the product links whose href contains https://www.amazon.com/. How can I retrieve them with XPath and Selenium? I would appreciate any help.

<a class="a-link-normal s-access-detail-page  s-color-twister-title-link a-text-normal" title="Under Armour Men's Tech Short Sleeve T-Shirt" href="https://www.amazon.com/Shortsleeve-T-Shirt-Under-Armour-Midnight/dp/B00783KT9Y/ref=sr_1_4?s=sports-and-fitness-clothing&amp;ie=UTF8&amp;qid=1516968485&amp;sr=1-4&amp;refinements=p_8%3A2229059011"><h2 data-attribute="Under Armour Men's Tech Short Sleeve T-Shirt" data-max-rows="0" class="a-size-base s-inline  s-access-title  a-text-normal">Under Armour Men's Tech Short Sleeve T-Shirt</h2></a>

2 Answers

Find every a tag whose href starts with that URL and select its href attribute:

//a[starts-with(@href, 'https://www.amazon.com/')]/@href
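
One caveat when running this expression through Selenium: its element finders expect the XPath to resolve to element nodes, and an expression ending in /@href resolves to attribute nodes, so Selenium rejects it as an invalid selector. A minimal sketch of a workaround, assuming Python: drop the trailing /@href, locate the anchors, then read href from each one. The helper name collect_hrefs and the de-duplication step are illustrative assumptions, not part of the answer. The locator strategy is passed as the plain string "xpath", which is the value of Selenium's By.XPATH constant.

```python
# Same expression as above, minus the trailing /@href, so it selects <a> elements.
PRODUCT_LINKS_XPATH = "//a[starts-with(@href, 'https://www.amazon.com/')]"

# selenium.webdriver.common.by.By.XPATH is the string "xpath"; using the literal
# keeps this helper importable without Selenium installed.
XPATH_STRATEGY = "xpath"


def collect_hrefs(driver, xpath=PRODUCT_LINKS_XPATH):
    """Return the href of every element matched by `xpath`, in page order.

    `driver` is any object exposing Selenium's find_elements(by, value) method.
    Duplicate and missing href values are skipped (an illustrative choice).
    """
    hrefs = []
    seen = set()
    for element in driver.find_elements(XPATH_STRATEGY, xpath):
        href = element.get_attribute("href")
        if href and href not in seen:
            seen.add(href)
            hrefs.append(href)
    return hrefs
```

With a live session this would be called as collect_hrefs(browser) after browser.get(...) has loaded the results page.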

1 Comment

Nice! I didn't know the starts-with option

This should work:

# selenium imports
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Anchors inside the search-result containers (elements whose id contains "result")
LINKS_XPATH = '//*[contains(@id,"result")]/div/div[3]/div[1]/a'
browser = webdriver.Firefox()
browser.get('https://www.amazon.com/s/ref=lp_11444071011_nr_p_8_1/132-3636705-4291947?rh=n%3A3375251%2Cn%3A%213375301%2Cn%3A10971181011%2Cn%3A11444071011%2Cp_8%3A2229059011')
# Selenium 4 removed the find_elements_by_* helpers; use find_elements with By.XPATH
links = browser.find_elements(By.XPATH, LINKS_XPATH)
for link in links:
    href = link.get_attribute('href')
    print(href)
browser.quit()

1 Comment

You're welcome! Here the WebDriverWait and EC imports are not used, but I recommend using them instead of calling find_elements directly, because explicit waits prevent a lot of exceptions, such as lookups that run before the page has finished rendering.
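
To illustrate what the comment above means by waiting, here is a minimal sketch of the idea behind an explicit wait: retry the lookup until it returns something or a deadline passes. This is a hand-rolled polling loop written for illustration, not Selenium's own implementation; the function name wait_for_elements and its timeout and poll defaults are assumptions.

```python
import time


def wait_for_elements(driver, xpath, timeout=10.0, poll=0.5):
    """Poll driver.find_elements until it returns a non-empty list.

    Raises TimeoutError if nothing matches within `timeout` seconds.
    "xpath" is the string value of Selenium's By.XPATH constant.
    """
    deadline = time.monotonic() + timeout
    while True:
        elements = driver.find_elements("xpath", xpath)
        if elements:
            return elements
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"no elements matched {xpath!r} within {timeout} s"
            )
        time.sleep(poll)
```

In real code the imports already present in the answer cover this case: WebDriverWait(browser, 10).until(EC.presence_of_all_elements_located((By.XPATH, LINKS_XPATH))) returns the matching elements once at least one is present and raises a timeout exception otherwise.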
