<article>
<div class="inner-article">
<h1><a class="name-link" href="dinamic_URL_1">Tee</a></h1>
<p><a class="name-link" href="dinamic_URL_1">Light Olive</a></p>
</div>
</article>

<article>
<div class="inner-article">
<h1><a class="name-link" href="dinamic_URL_2">Tee</a></h1>
<p><a class="name-link" href="dinamic_URL_2">Navy</a></p>
</div>
</article>

<article>
<div class="inner-article">
<h1><a class="name-link" href="dinamic_URL_3">Tee</a></h1>
<p><a class="name-link" href="dinamic_URL_3">Black</a></p>
</div>
</article>

<article>
<div class="inner-article">
<h1><a class="name-link" href="dinamic_URL_4">sweater</a></h1>
<p><a class="name-link" href="dinamic_URL_4">Light Olive</a></p>
</div>
</article>

<article>
<div class="inner-article">
<h1><a class="name-link" href="dinamic_URL_5">sweater</a></h1>
<p><a class="name-link" href="dinamic_URL_5">Navy</a></p>
</div>
</article>

<article>
<div class="inner-article">
<h1><a class="name-link" href="dinamic_URL_6">sweater</a></h1>
<p><a class="name-link" href="dinamic_URL_6">Black</a></p>
</div>
</article>

If possible, we have to rely on the link text ('Black', 'Tee', 'sweater' and so on), because the website is dynamic and tags like h1 and p could be removed. Thank you very much for your attention.

Suppose I want to click the div of the black sweater. Note: this is a live, dynamic website, and we assume there is an indeterminate number of other divs between and around these divs, so we cannot rely on the black sweater's div being the last one.

  1. We can't rely on URL addresses because they are dynamic.
  2. We can't use
driver.find_element_by_link_text('sweater').click()

because it would click the first matching link, which is in the div of the Light Olive sweater.

  3. We can't use
driver.find_element_by_link_text('Black').click()

because it would click the first matching link, which is in the div of the Black Tee.

As you can see, the divs for the same article share an identical first link text; only the second link (the colour) changes.

2 Answers

Try this XPath:

//div[h1[.="sweater"]][p[.='Black']]

It searches for a div that has child nodes h1 and p with the text you want.

If you do not want to rely on particular tags, use the * wildcard, which matches any element:

//div[*[.='sweater']][*[.='Black']]
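
For the requirement raised in the comments (match on the link text itself rather than on h1 or p), one possible variant anchors on the a elements and checks the nearest enclosing div. This is a minimal sketch, not a tested Selenium solution: the helper name link_pair_xpath is hypothetical, and it assumes the two links still share an enclosing div and that neither text contains a single quote.

```python
def link_pair_xpath(item_text, colour_text):
    """Return an XPath selecting the colour link whose nearest enclosing
    div also contains a link with the item text.

    Assumption: neither text contains a single quote.
    """
    return (
        f"//a[normalize-space(.)='{colour_text}']"
        f"[ancestor::div[1][.//a[normalize-space(.)='{item_text}']]]"
    )


xpath = link_pair_xpath("sweater", "Black")
print(xpath)
```

The resulting string could then be passed to driver.find_element_by_xpath(xpath).click(), where driver is the WebDriver instance from the question. Checking only ancestor::div[1] keeps an outer wrapper div, which contains every item, from satisfying both conditions at once.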

11 Comments

It works perfectly, congratulations, but it has one flaw: as the site is dynamic and can be modified, I would like to rely on the link text instead of tags like h1, h2, h3, p and so on. Can you adapt the answer to this need? If so, I'll accept it.
Done, have a look if this is what you meant.
I'm trying to use //div[a[text()='sweater']][a[text()='Black']] but it doesn't work.
The thing is, the only certainty we have is that there will be link text, while they could be difficult and remove any of the tags.
So //div[*[.='sweater']][*[.='Black']] will not be stable enough? You need it to match on a?

You can achieve this with XPath selectors in two steps (I'm using lxml.html here as an example, but it should be easy to convert to Selenium WebDriver's .find_element_by_xpath()):

from lxml import html

s = """
<article>
<div class="inner-article">
<h1><a class="name-link" href="dinamic_URL_1">Tee</a></h1>
<p><a class="name-link" href="dinamic_URL_1">Light Olive</a></p>
</div>
</article>

<article>
<div class="inner-article">
<h1><a class="name-link" href="dinamic_URL_2">Tee</a></h1>
<p><a class="name-link" href="dinamic_URL_2">Navy</a></p>
</div>
</article>

<article>
<div class="inner-article">
<h1><a class="name-link" href="dinamic_URL_3">Tee</a></h1>
<p><a class="name-link" href="dinamic_URL_3">Black</a></p>
</div>
</article>

<article>
<div class="inner-article">
<h1><a class="name-link" href="dinamic_URL_4">sweater</a></h1>
<p><a class="name-link" href="dinamic_URL_4">Light Olive</a></p>
</div>
</article>

<article>
<div class="inner-article">
<h1><a class="name-link" href="dinamic_URL_5">sweater</a></h1>
<p><a class="name-link" href="dinamic_URL_5">Navy</a></p>
</div>
</article>

<article>
<div class="inner-article">
<h1><a class="name-link" href="dinamic_URL_6">sweater</a></h1>
<p><a class="name-link" href="dinamic_URL_6">Black</a></p>
</div>
</article>
"""

tree = html.fromstring(s)

# step 1: collect the div (grandparent) of every link whose text contains 'Black'
divs = [el.getparent().getparent() for el in tree.xpath("//a[contains(text(), 'Black')]")]

# step 2: from those divs, keep the one that also holds a 'sweater' link
needle = list(filter(lambda div: div.xpath("h1/a[contains(text(), 'sweater')]"), divs))[0]
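
The two steps above can also be folded into a single XPath that does not name h1 or p. The following is a sketch only: it assumes lxml is installed, uses a shortened copy of the markup with placeholder href values (u3, u6), and still assumes both links share an enclosing div.

```python
from lxml import html

# Shortened, hypothetical copy of the page markup: one Black Tee, one Black sweater
snippet = """
<article><div class="inner-article">
<h1><a href="u3">Tee</a></h1><p><a href="u3">Black</a></p>
</div></article>
<article><div class="inner-article">
<h1><a href="u6">sweater</a></h1><p><a href="u6">Black</a></p>
</div></article>
"""

tree = html.fromstring(snippet)

# From each 'Black' link, step to its nearest enclosing div and keep that div
# only if it also contains a 'sweater' link somewhere below it
divs = tree.xpath(
    "//a[contains(text(), 'Black')]"
    "/ancestor::div[1][.//a[contains(text(), 'sweater')]]"
)
needle = divs[0]
print(needle.xpath(".//a/@href"))
```

Restricting the match to ancestor::div[1] matters here because html.fromstring wraps multiple top-level elements in a container, and that container holds both a 'Black' and a 'sweater' link.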

With Selenium WebDriver it should look something like this (not tested; Selenium is not installed in my dev environment):


# step 1: collect the div (grandparent) of every link whose text contains 'Black'
# note find_elements (plural): find_element returns a single element, not a list
divs = [el.find_element_by_xpath('../..') for el in
        web_driver.find_elements_by_xpath("//a[contains(text(), 'Black')]")]

# step 2: keep the div that also holds a 'sweater' link
# find_elements returns an empty list instead of raising when nothing matches
needle = [div for div in divs
          if div.find_elements_by_xpath("h1/a[contains(text(), 'sweater')]")][0]

5 Comments

Quite good, but the problem is that the webpage is online (in fact, it is dynamic).
What do you mean? You have two identifiers ("Black" and "sweater"), and the solution filters elements within the dynamic webpage in two steps: by the first identifier, then by the second one.
What you said is correct, but I have to get the black sweater div from an online website and not from a local string. I can't load the whole HTML source into a string and then search in it. Yes, it would work, but it's not the right way. If you can adjust your answer to work against a URL, then you win.
If you have to find an element on a webpage using Selenium, I'm sure you would not load the entire HTML source into a string and then search inside it. You want to identify it directly, and your filtering idea is probably on the right track.
Updated the answer using Selenium WebDriver.
