Getting text value using XPath with Python

Question

from lxml import html
import requests
url = 'https://www.bloomberg.com/quote/SPX:IND'
page = requests.get(url)
tree = html.fromstring(page.content)
num = tree.xpath('//*[@id="root"]/div/div/section[2]/div[1]/div/section[1]/section/section[2]/section/div[1]/span[1]/text()')
print(num)

This is the code I have written. I'm trying to get the string 2758.82 from this page, but what I get is:

[]

I copied the xpath for that section from the website. I have seen similar questions here, but they didn't help. Is something wrong with my code?

If you still don't get the number you wish to parse, there is one more thing you need to do besides what @Arount has already suggested: send a header, as in requests.get(url,headers={"User-Agent":"Mozilla/5.0"}), so that your scraper looks more like a regular browser.
– SIM, Commented Jul 8, 2018 at 13:07
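A minimal sketch of the header suggestion in the comment above. The helper name fetch_tree and the timeout value are assumptions made for illustration; only the User-Agent string comes from the comment. Whether the site returns different HTML with this header is not guaranteed.

```python
import requests
from lxml import html

# Header taken from the comment; a fuller browser string may be needed in practice.
HEADERS = {"User-Agent": "Mozilla/5.0"}


def fetch_tree(url, timeout=10):
    """Download a page with a browser-like User-Agent and parse it with lxml.

    fetch_tree is a hypothetical helper, not part of requests or lxml.
    """
    page = requests.get(url, headers=HEADERS, timeout=timeout)
    page.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error page
    return html.fromstring(page.content)
```

Calling fetch_tree('https://www.bloomberg.com/quote/SPX:IND') would then return a tree that can be queried with tree.xpath(...) as in the question.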
One more thing: how do I access something like <div style="display:inline" data-dobid="dfn"><span>some text.</span></div>, and what if <span> has some attributes too?
– Vinay Varma, Commented Jul 8, 2018 at 13:33
If you want to work with the visible tags, try Selenium, which lets you parse whatever items you want to grab as they appear in the rendered page.
– SIM, Commented Jul 8, 2018 at 13:39
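If the element from the question above is already present in the static HTML, a plain XPath attribute predicate may be enough. The sketch below runs against a hard-coded snippet; the class="note" attribute on the span is a hypothetical addition, used only to show that attributes on <span> do not get in the way and can optionally narrow the match.

```python
from lxml import html

# Static stand-in for a downloaded page; class="note" is a made-up attribute.
SNIPPET = (
    '<html><body>'
    '<div style="display:inline" data-dobid="dfn">'
    '<span class="note">some text.</span>'
    '</div>'
    '</body></html>'
)

tree = html.fromstring(SNIPPET)

# Select the span through the div's data-dobid attribute.
by_div = tree.xpath('//div[@data-dobid="dfn"]/span/text()')

# Same selection, additionally requiring an attribute on the span itself.
by_span = tree.xpath('//div[@data-dobid="dfn"]/span[@class="note"]/text()')

print(by_div)   # ['some text.']
print(by_span)  # ['some text.']
```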

Arount · Accepted Answer · 2018-07-08 12:25:10Z

It's not about the xpath. It's about how the page is generated.

If you check page.content you will see there is no <div id="root" [..]> in the webpage's source, because the HTML content is mainly generated via JavaScript.

But this should not stop you. If you open the raw HTML source (from page.content) and look for the value you want (2759.81), you will find a tag, <meta itemprop="price" content="2759.82" />, and another, <div class="price">2759.81</div>. You can use either of them:

print(tree.xpath('//*[@itemprop="price"]')[0].get('content'))

Gives

2759.82
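Both selectors can be tried offline. The sketch below uses a hard-coded snippet that contains only the two tags quoted in this answer; SAMPLE is a stand-in for page.content, not the real page.

```python
from lxml import html

# Stand-in for page.content, built from the two tags quoted in the answer.
SAMPLE = b"""
<html>
  <head><meta itemprop="price" content="2759.82" /></head>
  <body><div class="price">2759.81</div></body>
</html>
"""

tree = html.fromstring(SAMPLE)

# Read the attribute value directly with /@content ...
meta_price = tree.xpath('//meta[@itemprop="price"]/@content')
# ... or read the text node of the div.
div_price = tree.xpath('//div[@class="price"]/text()')

print(meta_price)  # ['2759.82']
print(div_price)   # ['2759.81']
```

Using /@content in the XPath returns the attribute strings directly, which is equivalent to selecting the element and calling .get('content') on it.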


1 Comment

Vinay Varma Over a year ago

Thanks! What do you mean by "from page.content"? Should I look for <meta itemprop="price" content="2759.82" /> in the actual page source? When I print page.content I get some unaligned HTML text, and I can't find <meta itemprop="price" content="2759.82" /> there. Also, when I try to execute the code you suggested, I get IndexError: list index out of range.
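The IndexError above means the XPath matched nothing, so [0] was applied to an empty list. One way to diagnose this, sketched under the assumption that the server returned HTML without the price tag, is to search the raw bytes first and to return None instead of indexing blindly. The helper first_or_none and both sample bodies are hypothetical.

```python
from lxml import html

PRICE_XPATH = '//meta[@itemprop="price"]/@content'


def first_or_none(tree, xpath_expr):
    """Return the first XPath match, or None when the result list is empty."""
    matches = tree.xpath(xpath_expr)
    return matches[0] if matches else None


# Two made-up response bodies: one with the tag, one without it.
with_tag = b'<html><head><meta itemprop="price" content="2759.82" /></head><body></body></html>'
without_tag = b'<html><head><title>placeholder</title></head><body><p>no price here</p></body></html>'

for body in (with_tag, without_tag):
    # A plain substring test on the raw bytes shows whether the tag was served at all.
    tag_present = b'itemprop="price"' in body
    print(tag_present, first_or_none(html.fromstring(body), PRICE_XPATH))
```

If the substring test is False for the real page.content, the problem lies in what the server sent rather than in the XPath.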
