I'm trying to parse the following HTML using the HTML Agility Pack.

Here is a snippet of it:

<body id="station_page" class="">
...
<div>....</div>
<script type="text/javascript"> 
if (Blablabla == undefined) { var Blablabla = {}; }
Blablabla .Data1= "I want this data";
Blablabla .BlablablaData = 
{  "Data2":"I want this data",
"Blablabla":"",
"Blablabla":0   }
{   "Blablabla":123,
"Data3":"I want this data",
"Blablabla":123}
    Blablabla .Data4= I want this data;
</script>...

I'm trying to get those 4 data values (Data1, Data2, Data3, Data4). First, I tried to find the script element:

doc.DocumentNode.SelectSingleNode("//script[@type='text/javascript']").InnerHtml

How can I check that it's really the right script? And once I've found the relevant script, how can I extract those 4 values (Data1, Data2, Data3, Data4)?

  • I think this is the wrong way of doing it. Not sure what's the right way, but this (using htmlagilitypack) isn't it. Commented Mar 8, 2013 at 14:47
  • Sounds like you need to execute the javascript, not just to parse it? If so then here's one way to do it: stackoverflow.com/questions/2530789/… Commented Mar 8, 2013 at 14:53

1 Answer

You can't parse JavaScript with HTML Agility Pack; it only parses HTML. You can, however, locate the script element you need with an XPath expression like this:

doc.DocumentNode.SelectSingleNode("//script[contains(text(), 'Blablabla')]").InnerHtml

But you'll then need to parse the JavaScript inside it by some other means (a regex, a JavaScript grammar/parser, etc.).
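
As a rough illustration of the regex route, here is a minimal sketch that assumes the script text has already been read from the node's InnerHtml and that the values look exactly like the ones in the question (quoted strings, or a bare value ending at a semicolon). The class and method names (ScriptDataExtractor, GetAssignedValue, GetJsonStringValue) are made up for this example and are not part of HTML Agility Pack. It only uses System.Text.RegularExpressions and will not cope with escaped quotes or nested structures.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ScriptDataExtractor
{
    // Matches assignments such as:  Blablabla .Data1= "value";
    // The value may be quoted, or bare up to the next ';' or line break
    // (as Data4 is in the question). Returns null when nothing matches.
    public static string GetAssignedValue(string script, string name)
    {
        var pattern = @"\.\s*" + Regex.Escape(name)
                    + @"\b\s*=\s*(?:""(?<v>[^""]*)""|(?<v>[^;\r\n]*))";
        var m = Regex.Match(script, pattern);
        return m.Success ? m.Groups["v"].Value.Trim() : null;
    }

    // Matches object-literal entries such as:  "Data2":"value"
    // Returns null when the key is absent or its value is not a quoted string.
    public static string GetJsonStringValue(string script, string key)
    {
        var pattern = "\"" + Regex.Escape(key) + "\"\\s*:\\s*\"(?<v>[^\"]*)\"";
        var m = Regex.Match(script, pattern);
        return m.Success ? m.Groups["v"].Value : null;
    }

    public static void Main()
    {
        // Stand-in for the InnerHtml of the script node found via XPath.
        var script = @"
Blablabla .Data1= ""I want this data"";
Blablabla .BlablablaData =
{  ""Data2"":""I want this data"",
""Blablabla"":"""" }
    Blablabla .Data4= I want this data;
";
        Console.WriteLine(GetAssignedValue(script, "Data1"));
        Console.WriteLine(GetJsonStringValue(script, "Data2"));
        Console.WriteLine(GetAssignedValue(script, "Data4"));
    }
}
```

A null result from either method is also a cheap way to check whether you picked up the right script block: if the names you expect aren't in it, nothing matches. If the real page is more complex than the snippet, a proper JavaScript or JSON parser is likely to be more robust than patterns like these.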
