link to full source code:

http://www.supremenewyork.com/shop/all/sweatshirts

I'm trying to scrape both the product element and its color from the site. I can already pull the name of the product and click it; however, I want to be able to pull all the products whose names contain a certain keyword, and then click the one in the color that I want. Any help is appreciated.

Edit: here is what I've tried:

product = driver.find_elements_by_partial_link_text(keyword)
for item in product:
    if item.parent.parent.find("p") == wanted_color:
        item.get_attribute("href")

Error:

Traceback (most recent call last):
  File "C:/Users/B/PycharmProjects/BasicSelenium/test.py", line 17, in <module>
    if item.parent.parent.find("p") == color:
AttributeError: 'WebDriver' object has no attribute 'parent'
  • Show us what you've got so far so we know where to start. Post some page source code or share a link to the page source instead of screenshots please Commented Apr 10, 2017 at 17:55
  • Ah, I'd tried to add an image; it must not have been added. I will right now. Commented Apr 10, 2017 at 17:59
  • @bren added them! Commented Apr 10, 2017 at 18:02
  • Nice, have you already tried something like this? stackoverflow.com/a/7866938/6085135 Show us the code you've tried so far and how it's not working as expected. Commented Apr 10, 2017 at 18:04
  • @bren Edited with an example of what I've tried. I've tried other things too, but can't remember exactly what I put. Commented Apr 10, 2017 at 18:09

2 Answers

For something like this I would write a function that takes in a keyword and a color name. You can take those values and insert them into a single XPath and click on the A tag that is returned.

def select_product(keyword, color):
    driver.find_element_by_xpath("//article//a[contains(., '" + keyword + "')]/../../p/a[contains(., '" + color + "')]").click()

You would call it like

select_product("Geto Boys", "Ash Grey")
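One caveat with building the XPath by string concatenation: a keyword or color that contains a quote character would produce an invalid expression, since XPath 1.0 has no escape character inside string literals. Below is a minimal sketch of one common workaround. The helper names xpath_literal and product_xpath are hypothetical and not part of Selenium.

```python
def xpath_literal(text):
    """Return text as an XPath 1.0 string literal, whichever quotes it contains."""
    if "'" not in text:
        return "'" + text + "'"
    if '"' not in text:
        return '"' + text + '"'
    # Both quote types present: split on single quotes and rejoin with concat().
    parts = text.split("'")
    return "concat(" + ", \"'\", ".join("'" + p + "'" for p in parts) + ")"


def product_xpath(keyword, color):
    """Build the same locator as select_product, with safely quoted inputs."""
    return (
        "//article//a[contains(., " + xpath_literal(keyword) + ")]"
        "/../../p/a[contains(., " + xpath_literal(color) + ")]"
    )


print(product_xpath("Geto Boys", "Ash Grey"))
```

The resulting string could then be passed to the same XPath lookup used in select_product.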

Some quick XPath info

// means any depth vs / which means child (one level down)

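a[contains(., "some text")] means find an A tag that contains the text "some text". The . in contains() refers to the current element, and its string value is all of the text inside that element, including text in nested tags. (text(), by contrast, selects only the element's direct text nodes.)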

/.. means go up one level

Putting this all together, the XPath reads: find an ARTICLE tag at any level that has a descendant A tag (at any depth) containing the keyword text; go up two levels from that A tag; then find a P child of that ancestor that has an A child containing the color text.
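To make each step concrete, here is a rough sketch that performs the same traversal by hand with Python's standard library, with each line annotated with the XPath step it mirrors. The sample markup (including the inner-article class and the hrefs) is an assumption modeled on the page structure, not copied from the site, and find_color_link is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup; the real page may differ.
SAMPLE = """
<div>
  <article><div class="inner-article">
    <h1><a class="name-link" href="/shop/a1">Geto Boys Hoodie</a></h1>
    <p><a class="name-link" href="/shop/a1">Ash Grey</a></p>
  </div></article>
  <article><div class="inner-article">
    <h1><a class="name-link" href="/shop/a2">Geto Boys Hoodie</a></h1>
    <p><a class="name-link" href="/shop/a2">Black</a></p>
  </div></article>
</div>
"""


def find_color_link(root, keyword, color):
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    for article in root.iter("article"):                # //article
        for a in article.iter("a"):                     # //a
            if keyword not in "".join(a.itertext()):    # [contains(., keyword)]
                continue
            container = parent_of[parent_of[a]]         # /../..
            for p in container.findall("p"):            # /p
                for link in p.findall("a"):             # /a
                    if color in "".join(link.itertext()):  # [contains(., color)]
                        return link
    return None


root = ET.fromstring(SAMPLE)
link = find_color_link(root, "Geto Boys", "Black")
print(link.get("href"))
```

In Selenium the browser evaluates the XPath for you; this sketch only illustrates what each step selects.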

XPath is a query language in its own right, so it is worth reading a dedicated XPath guide.

Side note... I would suggest that you favor finding elements in this order:

  1. by ID
  2. by CSS selector

...then, if you can't find it either of those ways, fall back to XPath, for example to locate elements by their contained text. XPath is generally slower and less consistently supported across browsers than CSS selectors. I used it in this case because you needed to find an element based on its contained text; otherwise I would have used a CSS selector.

2 Comments

Thanks, it worked great. Care to explain in a little more depth how you created that XPath? Sorry, I'm pretty new to coding in Python, and any info helps!
I updated my answer with some explanation of the XPath and some other recommendations.
Here's one way:

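from selenium import webdriver

# URL of the category page linked in the question
url = "http://www.supremenewyork.com/shop/all/sweatshirts"

browser = webdriver.Chrome()
browser.get(url)
# Product names and color names both use the 'name-link' class
anchors = browser.find_elements_by_class_name('name-link')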

This gets us a list of alternating tags like this:

<h1><a class="name-link" href="/shop/blahblah">Very Cool Sweatshirt</a></h1>
<p><a class="name-link" href="/shop/blahblah">Red</a></p>  

We can split the list into pairs and extract text as needed:

n = 2  # each product contributes two anchors: name, then color
products = [anchors[i:i+n] for i in range(0, len(anchors), n)]
for item in products:
    element, description, color = item[0], item[0].text, item[1].text
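As a quick illustration of the pairing step without a browser, the sketch below applies the same slicing to plain strings standing in for the anchors' text, then filters by keyword and color. The sample data and the chunk helper are hypothetical, and the approach assumes the list really does alternate name, color with an even length.

```python
def chunk(items, n):
    """Split items into consecutive groups of n (the last group may be shorter)."""
    return [items[i:i + n] for i in range(0, len(items), n)]


# Stand-ins for the .text of each anchor, in page order: name, color, name, color...
texts = ["Very Cool Sweatshirt", "Red", "Very Cool Sweatshirt", "Blue"]

pairs = chunk(texts, 2)
# Keep only the products whose name contains the keyword and whose color matches.
matches = [(name, color) for name, color in pairs
           if "Sweatshirt" in name and color == "Blue"]
print(matches)
```

With real WebElements you would keep the element itself in each pair so that the matching one can be clicked.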

Or we can filter for things using parent tag_name:

products = []
for element in anchors:
    if element.find_element_by_xpath('..').tag_name == 'p':  # or 'h1'
        text = element.text
        products.append([element, text])
