This is an example of the data I'm working with:

<div class="category">
    fruit
</div>
<div class="location">
    <a href="/fruit">fruit</a>
</div>

The only thing that changes is the link in the second div, and I'd like to pull the href portion.

How can I target and extract it?

1 Answer

Update: in XPath, . represents the "context node", that is, the node selected by the preceding path step. To select the <div class="category"/> whose text content is equal to "fruit", and then step to the href in its sibling:

/div[@class eq "category"][. eq "fruit"]
  /following-sibling::div[@class eq "location"]/a/@href

If the HTML is formatted with whitespace in the text node (as it is in your example), you can use the contains() function to match part of the text node:

/div[@class eq "category"][contains(., "fruit")]
  /following-sibling::div[@class eq "location"]/a/@href
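
For readers who want to try the same selection logic outside an XPath 2.0 processor, here is a minimal Python sketch using only the standard library. It is not the expression above run verbatim: ElementTree's XPath subset has no following-sibling axis, so the sibling walk is written by hand. The <root> wrapper, the extra "vegetable" pair, and the function name find_href are assumptions made up for illustration. The sketch also trims whitespace and compares for equality instead of using contains(), and it returns only the first matching link.

```python
import xml.etree.ElementTree as ET

# Markup from the question, wrapped in a hypothetical <root> element so it
# parses as a single document. The "vegetable" pair is invented test data.
SAMPLE = """
<root>
  <div class="category">
      vegetable
  </div>
  <div class="location">
      <a href="/vegetable">vegetable</a>
  </div>
  <div class="category">
      fruit
  </div>
  <div class="location">
      <a href="/fruit">fruit</a>
  </div>
</root>
"""


def find_href(parent, wanted):
    """Return the href of the first link found in a "location" div that
    follows a "category" div whose trimmed text equals `wanted`, else None."""
    children = list(parent)
    for i, node in enumerate(children):
        if node.tag != "div" or node.get("class") != "category":
            continue
        # Trim the surrounding whitespace before comparing the text.
        if "".join(node.itertext()).strip() != wanted:
            continue
        # Walk the later siblings by hand, mimicking the following-sibling axis.
        for sibling in children[i + 1:]:
            if sibling.tag == "div" and sibling.get("class") == "location":
                link = sibling.find("a")
                if link is not None:
                    return link.get("href")
    return None


root = ET.fromstring(SAMPLE)
print(find_href(root, "fruit"))  # prints /fruit
```

Comparing the trimmed text avoids a side effect of contains(), which would also match a category such as "fruit juice".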

Original Answer

You can select that href in many different ways. Based on your title, it seems that you are already selecting div/@class eq "category", so you could use the following-sibling axis like this:

/div[@class eq "category"]/following-sibling::div[@class eq "location"]/a/@href

2 Comments

There are many divs with class="category" and I need to select the one that contains the word "fruit". So the content of the div determines whether I grab its sibling. How can I do that?
@Jimbotron I added a couple of ways to select based on the text of the element.
