When I try to retrieve some data with XPath from the URL in the following code, I get an empty list:

from lxml import html
import requests

if __name__ == '__main__':
    url = 'https://www.leagueofgraphs.com/champions/stats/aatrox'

    page = requests.get(url)
    tree = html.fromstring(page.content)

    # XPath to get the XP
    print(tree.xpath('//*[@id="graphDD1"]/text()'))
>>> []

What I expect is a string value like this one:

>>> ['
        5.0%    ']

1 Answer

This is because the element your XPath is looking for is generated by JavaScript, so it is not in the HTML that a plain requests.get returns.
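
One way to check this is to look for the id in the raw response before doing anything else. The sketch below uses only the standard library's html.parser; `contains_id` is a hypothetical helper, and the two HTML strings are made-up stand-ins for what `page.text` might hold with and without the element.

```python
from html.parser import HTMLParser


class _IdCollector(HTMLParser):
    """Collects every id attribute seen while parsing."""

    def __init__(self):
        super().__init__()
        self.ids = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == 'id' and value is not None:
                self.ids.add(value)


def contains_id(html_text, element_id):
    """Return True if any start tag in html_text carries the given id."""
    collector = _IdCollector()
    collector.feed(html_text)
    collector.close()
    return element_id in collector.ids


# Made-up examples; with requests you would pass page.text instead
with_element = '<div id="graphDD1"> 5.0% </div>'
without_element = '<div id="challenge"></div><script>var x = 1;</script>'

print(contains_id(with_element, 'graphDD1'))
print(contains_id(without_element, 'graphDD1'))
```

If this returns False for `page.text`, no XPath will find the element in that response, and the request itself has to change.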

You will need to find the cookie that is set after the JavaScript has run, so that you can send the same request to the URL yourself.

  1. Go to the 'Network' tab of the browser's developer tools
  2. Find the difference in the request headers after abg_lite.js has run (mine was cookie: __cf_bm=TtnYbPlIA0J_GOhNj2muKa1pi8pU38iqA3Yglaua7q8-1636535361-0-AQcpStbhEdH3oPnKSuPIRLHVBXaqVwo+zf6d3YI/rhmk/RvN5B7OaIcfwtvVyR0IolwcoCk4ClrSvbBP4DVJ70I=)
  3. Add the cookie to your request

from lxml import html
import requests

if __name__ == '__main__':
    url = 'https://www.leagueofgraphs.com/champions/stats/aatrox'

    # Create a session to add cookies and headers to
    s = requests.Session()

    # After finding the correct cookie, add it to the session's cookie jar
    # under its own name (__cf_bm here); replace the value with your own
    s.cookies.set(
        '__cf_bm',
        'TtnYbPlIA0J_GOhNj2muKa1pi8pU38iqA3Yglaua7q8-1636535361-0-'
        'AQcpStbhEdH3oPnKSuPIRLHVBXaqVwo+zf6d3YI/rhmk/RvN5B7OaIcfwtvVyR0IolwcoCk4ClrSvbBP4DVJ70I=',
    )

    # Update headers to spoof a regular browser; this may not be necessary
    # but is good practice to bypass any basic bot detection
    s.headers.update({
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'
                      ' AppleWebKit/537.36 (KHTML, like Gecko)'
                      ' Chrome/89.0.4389.90 Safari/537.36',
    })

    page = s.get(url)
    tree = html.fromstring(page.content)

    # XPath to get the XP
    print(tree.xpath('//*[@id="graphDD1"]/text()'))

This produces the following output:

['\r\n 5.0% ']
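
The value copied from the developer tools is a raw cookie: header, which can hold several name=value pairs separated by semicolons. As a rough sketch, it can be split into a dict before being handed to the session; `parse_cookie_header` is a hypothetical helper, and the header string below is a shortened placeholder, not a working cookie.

```python
def parse_cookie_header(header_value):
    """Split a raw Cookie header value into a {name: value} dict."""
    cookies = {}
    for pair in header_value.split(';'):
        # Split on the first '=' only, since values may contain '=' themselves
        name, sep, value = pair.strip().partition('=')
        if sep and name:
            cookies[name] = value
    return cookies


# Placeholder header; paste the value from your own browser instead
raw = '__cf_bm=abc123-0-xyz=; other=1'
parsed = parse_cookie_header(raw)
print(parsed)
```

With requests, the resulting dict can then be passed to `s.cookies.update(parsed)`, so each cookie is stored under its own name.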
