parsing JavaScript code in HTML source

Question

How do I parse JavaScript code embedded in HTML source with Python? For example, I want to extract the productList object.

Here is my source:

<html>
<body>
<div id="content-wrapper" class="row-fluid clearfix" role="contentinfo">
<!-- html content -->
</div>


   <script>
    var productList = { "daaa" : "ddddd"};
   </script>

</body>
</html>

Do either of these help? stackoverflow.com/questions/390992/javascript-parser-in-python stackoverflow.com/questions/18368058/…
– Curtis Mattoon, Commented Nov 24, 2014 at 21:41
One issue you may encounter at some point is that var productList = { daaa : function() {}}; is valid JS, but not valid JSON.
– njzk2, Commented Nov 24, 2014 at 21:43

Victor · Accepted Answer · 2014-11-25 00:42:45Z

1

I suggest you take a look at BeautifulSoup: it can help you extract JavaScript code from an HTML file (but it does not parse or run that code):

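from bs4 import BeautifulSoup

source = """<html>...</html>"""

# Name the parser explicitly; omitting it makes bs4 emit a warning and
# pick whichever parser happens to be installed.
soup = BeautifulSoup(source, "html.parser")
# Text of the first <script> element (assumes the page has at least one).
js_code = soup.find_all("script")[0].text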

Then you can use a JavaScript interpreter to run the code and read the variables. Several interpreters can be driven from Python; a quick search will turn them up.
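For the specific source in the question, an interpreter may not be needed, because the object literal assigned to productList happens to be valid JSON. The following is only a sketch under that assumption: it uses the standard library's html.parser in place of BeautifulSoup so that it has no third-party dependencies, and the names ScriptCollector and extract_js_object are made up for illustration. If the literal is not valid JSON (for example, it contains a function, as noted in the comments on the question), json will raise JSONDecodeError and a real JavaScript interpreter is the safer route.

```python
import json
import re
from html.parser import HTMLParser


class ScriptCollector(HTMLParser):
    """Collect the text content of every <script> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.scripts = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if self._in_script:
            self.scripts[-1] += data


def extract_js_object(html, var_name):
    """Return the value assigned to `var_name`, assuming it is a JSON literal."""
    collector = ScriptCollector()
    collector.feed(html)
    collector.close()
    # Match "var <name> =" plus trailing whitespace, so that the match ends
    # exactly where the literal begins.
    pattern = re.compile(r"\bvar\s+" + re.escape(var_name) + r"\s*=\s*")
    decoder = json.JSONDecoder()
    for script in collector.scripts:
        match = pattern.search(script)
        if match:
            # raw_decode parses one JSON value starting at the given index
            # and ignores whatever follows it (here, the trailing ";").
            value, _end = decoder.raw_decode(script, match.end())
            return value
    return None  # no script assigns to var_name


source = """<html>
<body>
   <script>
    var productList = { "daaa" : "ddddd"};
   </script>
</body>
</html>"""

product_list = extract_js_object(source, "productList")
print(product_list)  # {'daaa': 'ddddd'}
```

The result is an ordinary Python dict, so product_list["daaa"] gives "ddddd".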

answered Nov 25, 2014 at 0:42

Victor


2 Comments

parkerproject Over a year ago

What do you think of using a regexp instead to parse the extracted JavaScript?

Victor Over a year ago

@Parker, I am not sure that's a good idea, though I have never tried to parse a programming language with regex myself. I guess you could try. By the way, you could also use pyparsing: it lets you build your own parsers for different languages.
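To illustrate why a regex alone tends to be fragile here, consider a nested object whose string values contain braces. The sketch below compares a naive non-greedy pattern with a small brace-counting scanner that skips over string literals. The function name extract_object_literal and the sample js_code are invented for this example, and the scanner is deliberately limited: it does not understand JavaScript comments, regex literals, or template literals.

```python
import re


def extract_object_literal(js_code, var_name):
    """Return the source text of the object literal assigned to `var_name`."""
    match = re.search(r"\bvar\s+" + re.escape(var_name) + r"\s*=\s*\{", js_code)
    if match is None:
        return None
    start = match.end() - 1  # index of the opening brace
    depth = 0
    quote = None  # current string delimiter, or None when outside a string
    i = start
    while i < len(js_code):
        ch = js_code[i]
        if quote:
            if ch == "\\":
                i += 1  # skip the escaped character
            elif ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return js_code[start:i + 1]
        i += 1
    return None  # braces never balanced


js_code = 'var productList = { "a": { "b": "}" }, "c": 1 }; var other = {};'

# The naive pattern stops at the first "}", which sits inside a string.
naive = re.search(r"\{.*?\}", js_code).group(0)
balanced = extract_object_literal(js_code, "productList")

print(naive)     # { "a": { "b": "}
print(balanced)  # { "a": { "b": "}" }, "c": 1 }
```

Even the balanced version only returns the literal's text; turning it into data still needs a JSON parser or a JavaScript interpreter.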

Michael Dorner · Accepted Answer · 2016-09-21 15:55:46Z

-1

I think you need to add the function so the computer can tell whether it is JavaScript or Python; use this:

<script type="text/javascript">  <!-- or python --></script>

edited Sep 21, 2016 at 15:55

Michael Dorner


answered Nov 24, 2014 at 21:44

Ben Riley


1 Comment

Elias Benevedes Over a year ago

Hello Ben Riley, Welcome to Stack Overflow! This is not a complete answer; please go back and edit to fully answer the question.
