xpath javascript in python

Question

I want to parse information on a website and I have been doing it successfully with just pure HTML. For instance for the following code:

<div>
 <ul>
  <h3>This is a heading</h3>
 </ul>
</div>

I would use "answ = pagehtml.xpath('//div/ul/h3')" and "answ[0].text" would be "This is a heading".

But now I have a web page with a JavaScript block that looks like this:

<script>
var XYZ = XYZ || {};
XYZ.contentModel = {
    layout: "no-rail",
    analytics: {
        "pageTop": {},
        "chartbeat": {
            "sections": ""
        },
        "branding_content_page": "default",
        "branding_content_card": [""]
    },
    edition: "Hometown",
    title: "This is the title",
    siblings: {
        "articleList": [{
            "uri": "Got-to-this-webpage.html",
            "description": "",
            "layout": ""
        }]

So I would like to know how to extract the uri link from this script. Here is what I have tried, but it failed: answ = pagehtml.xpath('//script/XYZ/siblings/articleList/uri')

What would be the correct xpath to use, if any?

Thanks a lot

Markus · Accepted Answer · 2016-11-24 13:48:15Z


There is no XPath expression that gets what you want. XPath only operates on the nodes of the document tree, and the deepest node here is the script element; the JavaScript inside it is plain text as far as XPath is concerned.

So you have to get the string contents of the script element (possibly using XPath) and then extract the URI from that string yourself. In this case the information you are looking for is encoded in a JSON-like structure, so you can possibly use the JSON capabilities of Python (the json module).
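A minimal sketch of that second step, using only the standard library. It assumes the script text is already available as a string; with lxml it could be obtained with something like pagehtml.xpath('//script/text()'). The sample text is modelled on the question, and its closing braces are an assumption because the original snippet is cut off. The function name extract_article_uris is hypothetical. Since the outer object uses unquoted keys (layout:, edition:) it is not valid JSON as a whole, so the sketch only decodes the "articleList" array, which is.

```python
import json
import re

# Sample script text modelled on the question. The closing braces are assumed,
# because the snippet in the question is truncated.
script_text = '''
var XYZ = XYZ || {};
XYZ.contentModel = {
    layout: "no-rail",
    edition: "Hometown",
    title: "This is the title",
    siblings: {
        "articleList": [{
            "uri": "Got-to-this-webpage.html",
            "description": "",
            "layout": ""
        }]
    }
};
'''


def extract_article_uris(text):
    """Return the "uri" values found in the "articleList" array (hypothetical helper)."""
    # Find the key and skip past the colon and any whitespace after it.
    match = re.search(r'"articleList"\s*:\s*', text)
    if match is None:
        return []
    # raw_decode parses a single JSON value starting at the given index
    # and ignores whatever JavaScript follows it.
    articles, _end = json.JSONDecoder().raw_decode(text, match.end())
    return [article["uri"] for article in articles if "uri" in article]


uris = extract_article_uris(script_text)
print(uris)
```

If the real page quotes its keys differently or builds the list dynamically, this approach would need adjusting; a plain regular expression on "uri" is a cruder fallback.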


1 Comment

Alfa Bravo Over a year ago

Ah ok, so I can just abandon that thought path. I will start looking at JSON in Python, but have no idea how much I have to learn now just to get this info. :(
