I have a website where the data I want to fetch is stored inside a JavaScript block. How do I fetch it?

The code is here: http://pastebin.com/zhdWT5HM

I want to fetch from the "var playersData" line. Specifically, I want this part: "playerId":"showsPlayer" (without the quotes, obviously). How do I do so?

I've tried BeautifulSoup. My current script looks like this:

q = requests.get('websitelink')
soup = BeautifulSoup(q.text)

searching = soup.findAll('script',{'type':'text/javascript'})
for playerId in searching:
  x = playerId.find_all('var playersData', limit=1)
  print x

I'm getting [] as my output. I can't seem to figure out my problem here. Please help out guys and gals :)

1 Answer

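BeautifulSoup only helps you locate the desired script tag; it does not parse the JavaScript inside it. Your find_all('var playersData') call returns [] because it searches for child tags with that name, and a script tag contains only a text string. Once you have the tag, you have several options: extract the data with a JavaScript parser such as slimit, or use regular expressions: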

import re

from bs4 import BeautifulSoup

page = """
<script type="text/javascript">
            var logged = true;
            var video_id = 59374;
            var item_type = 'official';

            var debug = false;
            var baseUrl = 'http://www.example.com';
            var base_url = 'http://www.example.com/';
            var assetsBaseUrl = 'http://www.example.com/assets';
            var apiBaseUrl = 'http://www.example.com/common';
            var playersData = [{"playerId":"showsPlayer","userId":true,"solution":"flash","playlist":[{"itemId":"5090","itemAK":"Movie"}]}];
</script><script type="text/javascript" >
"""
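# Name the parser explicitly so bs4 does not have to guess one.
soup = BeautifulSoup(page, "html.parser")

# Non-greedy capture of the value that follows the "playerId" key.
pattern = re.compile(r'"playerId":"(.*?)"', re.MULTILINE | re.DOTALL)
# Find the first <script> tag whose string content matches the pattern.
script = soup.find("script", text=pattern)

print(pattern.search(script.text).group(1))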

Prints:

showsPlayer
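
If you need more than one field, another option is to parse the whole playersData value rather than a single key. The following is only a sketch, and it assumes the value assigned to playersData is valid JSON, which is not guaranteed for arbitrary JavaScript. SCRIPT_TEXT and extract_players_data are hypothetical names used for illustration; in practice you would pass in the text of the script tag located with BeautifulSoup.

```python
import json
import re

# Hypothetical stand-in for the text of the located <script> tag.
# Assumption: the playersData value is well-formed JSON.
SCRIPT_TEXT = """
    var debug = false;
    var playersData = [{"playerId":"showsPlayer","userId":true,"solution":"flash","playlist":[{"itemId":"5090","itemAK":"Movie"}]}];
"""


def extract_players_data(script_text):
    """Return the parsed value assigned to `var playersData`, or None."""
    # Match up to and including the whitespace after "=", so the match
    # ends exactly where the value starts.
    match = re.search(r'var\s+playersData\s*=\s*', script_text)
    if match is None:
        return None
    # raw_decode parses one JSON value from the start of the string and
    # ignores whatever follows it (here, the trailing ";").
    value, _end = json.JSONDecoder().raw_decode(script_text[match.end():])
    return value


players = extract_players_data(SCRIPT_TEXT)
print(players[0]["playerId"])
```

This gives you ordinary Python lists and dicts, so nested values such as players[0]["playlist"] are available without writing further patterns. If the real page uses single quotes, unquoted keys, or trailing commas, json will raise an error, and a JavaScript parser is the safer route.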