The HTML is not known in advance ("blind"), but it contains the string "PRICE". I want to partially match that string against the HTML text using XPath and, when it matches, get the path of the HTML tag that contains it.
Note: I need to automate this for multiple sites, so I have to use a generic rule (locate "Price", then fetch its parent tag).
Here is an example:
html="""<div id = "price_id">
<span id = "id1"></span>
<div class="price_class">
<bold>
<strong>
<label>PRICE:</label> 125 Rs.
</bold>
</strong>
</br>
</br>
</div>"""
I tried this with lxml:
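from lxml.html import fromstring
from lxml.html.clean import Cleaner

# page_structure=False keeps structural tags such as <html> and <head>.
cleaner = Cleaner(page_structure=False)
cl = cleaner.clean_html(html)
cleaned_html = fromstring(cl)
# Walk every element, not only the root's direct children.
for element in cleaned_html.iter():
    # The text is "PRICE:", so test for a partial match.
    if element.text and 'PRICE' in element.text:
        print("matched")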
How would this be written as an XPath expression?
I just need to get the path of the div with that class using an XPath expression.
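For reference, one way this could look is sketched below. It assumes lxml is installed and uses a tidied copy of the sample above; `nearest_divs` and `SAMPLE` are hypothetical names made up for illustration. The XPath `//*[text()[contains(., $needle)]]` does the partial match, and `ancestor::div[1]` picks the closest enclosing div.

```python
from lxml import html as lxml_html

# Tidied copy of the sample markup (assumption: tags closed in order).
SAMPLE = """<div id="price_id">
<span id="id1"></span>
<div class="price_class">
<bold><strong><label>PRICE:</label> 125 Rs.</strong></bold>
</div>
</div>"""


def nearest_divs(markup, needle="PRICE"):
    """Return (div element, absolute XPath) for each element whose text contains needle."""
    root = lxml_html.fromstring(markup)
    tree = root.getroottree()
    found = []
    # True when any of the element's own text nodes contains the needle,
    # which gives the partial match.
    for hit in root.xpath("//*[text()[contains(., $needle)]]", needle=needle):
        # ancestor::div[1] is the closest enclosing <div>, if there is one.
        for div in hit.xpath("ancestor::div[1]"):
            found.append((div, tree.getpath(div)))
    return found


for div, path in nearest_divs(SAMPLE):
    print(div.get("class"), path)
```

This only works when the wanted container is always a div; it does not yet handle sites where the container is some other tag.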
There is a second problem: once I locate the "PRICE:" string, I need its nearest valid parent tag, which here is the "div" with class name "price_class". To get there I have to skip or remove unwanted formatting tags such as font, bold, italic...
Could you please suggest how to get the valid parent tag of the located string?
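A rough sketch of the generic rule I have in mind: find the element whose text contains the string, then walk up its ancestors and stop at the first tag that is not a formatting tag. It assumes lxml is installed; `SKIP_TAGS`, `structural_parent` and `locate` are hypothetical names, and the list of tags to skip is my own guess that would need tuning per site.

```python
from lxml import html as lxml_html

# Assumed list of presentational tags to ignore; adjust as needed.
SKIP_TAGS = {"b", "bold", "strong", "i", "italic", "em", "u",
             "font", "label", "span", "small", "big"}

DIV_PAGE = """<div id="price_id"><div class="price_class">
<bold><strong><label>PRICE:</label> 125 Rs.</strong></bold>
</div></div>"""

TABLE_PAGE = """<table><tr><td class="cost">
<font><i>PRICE: 99</i></font></td></tr></table>"""


def structural_parent(element, skip=SKIP_TAGS):
    """Return the nearest ancestor whose tag is not in skip, or None."""
    # iterancestors() yields the closest ancestor first.
    for ancestor in element.iterancestors():
        if ancestor.tag not in skip:
            return ancestor
    return None


def locate(markup, needle="PRICE"):
    """Return (tag, attributes, absolute XPath) of each match's structural parent."""
    root = lxml_html.fromstring(markup)
    tree = root.getroottree()
    results = []
    # Partial match on the element's own text nodes.
    for hit in root.xpath("//*[text()[contains(., $needle)]]", needle=needle):
        parent = structural_parent(hit)
        if parent is not None:
            results.append((parent.tag, dict(parent.attrib), tree.getpath(parent)))
    return results


for page in (DIV_PAGE, TABLE_PAGE):
    for tag, attrs, path in locate(page):
        print(tag, attrs, path)
```

Because the walk is done in Python rather than in the XPath itself, the same function should work whether the container is a div, a td, or something else, as long as the skip list covers the formatting tags a site uses.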