I am reading an HTML page as a string and parsing it with tree = html.fromstring(data).

I now want to query it with lxml's XPath support. Below is an example of the part I am interested in.

<table class="class">
 <tbody>
  <tr>
   <th class="classTh">
    Overall
   </th>
   <td class="classTd">
    <span class="classSpan">
     GREEN
    </span>
   </td>
  </tr>
 </tbody>
</table>

I query it with this call:

xpath = '//table/tbody/tr[th="Overall"]/td/span'
e = tree.xpath(xpath)
for i in e:
    print(i.text)

I am using XPath to get the data I need, but I cannot get this XPath to work. The exact same code and XPath work for me in any online tester.

I have also tried this XPath:

xpath = '//table/tbody/tr[th]/td/span'

which returns all the span elements instead of only the ones with the correct filter value, and

xpath = '//table/tbody/tr[td/span]/th'

which returns all the filter values.

So my question is: how do I apply the text-value filter in my XPath correctly?

Comments

  • When you try it with an online XPath tester, you are working with a page whose JavaScript has already been executed. Your table might be generated dynamically, and HTTP libraries such as requests or urllib can only give you the page source without JavaScript executed. Commented Jul 5, 2017 at 11:11
  • Because the last two XPath queries work, which confirms that the data I'm querying against is correct, I didn't think there would be any problem with the data itself. I am under the impression that the problem is with the query. Or am I missing the point? :) Commented Jul 5, 2017 at 11:41

1 Answer

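The comparison th="Overall" tests the whole string value of the th element, and in this markup that text is padded with newlines and spaces, so an exact match fails. Matching with contains() works in lxml: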

xpath = "//table/tbody/tr[th[contains(text(), 'Overall')]]/td/span"

This solved my problem.
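For anyone who wants to reproduce this, here is a minimal sketch that runs the queries against the sample markup from the question. It assumes lxml is installed. The variable names and the normalize-space() variant are my own additions for illustration and are not part of the original code; normalize-space() is a standard XPath 1.0 function that trims leading and trailing whitespace, which allows an exact match instead of a substring match.

```python
from lxml import html

# Sample markup copied from the question; note the whitespace around "Overall".
data = """
<table class="class">
 <tbody>
  <tr>
   <th class="classTh">
    Overall
   </th>
   <td class="classTd">
    <span class="classSpan">
     GREEN
    </span>
   </td>
  </tr>
 </tbody>
</table>
"""

tree = html.fromstring(data)

# Exact comparison from the question: the th string value includes the
# surrounding newlines and spaces, so it is not equal to "Overall".
exact = tree.xpath('//table/tbody/tr[th="Overall"]/td/span')

# Substring match from the answer.
contains = tree.xpath(
    "//table/tbody/tr[th[contains(text(), 'Overall')]]/td/span"
)

# Alternative (an assumption, not from the original post): trim the
# whitespace first, then compare exactly.
normalized = tree.xpath(
    '//table/tbody/tr[normalize-space(th)="Overall"]/td/span'
)

# The span text is padded with whitespace as well, so strip it for display.
exact_texts = [span.text.strip() for span in exact]
contains_texts = [span.text.strip() for span in contains]
normalized_texts = [span.text.strip() for span in normalized]

print(exact_texts)
print(contains_texts)
print(normalized_texts)
```

Note that contains() would also match a header such as "Overall score", so the normalize-space() form may be the safer choice when several headers share a prefix.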
