OK, I have been scratching my head over this for way too long. I am trying to retrieve the URL of an embedded video on a web page using the Beautiful Soup and requests modules in Python 2.7.6. When I inspect the HTML in Chrome, I can see the URL of the video, but when I fetch the page with requests and parse it with Beautiful Soup, I can't find the "video" node. From looking at the source, the video window appears to be a nested HTML document. I have searched all over and can't figure out why I can't retrieve it. If anyone could point me in the right direction, I would greatly appreciate it. Thanks.

Here is the URL of one of the videos:

http://www.growingagreenerworld.com/episode125/

1 Answer

The problem is that the video tag sits inside an iframe, which the browser loads asynchronously as a separate document, so it is not part of the HTML that requests downloads for the page itself.

The good news is that you can simulate that behavior by making an additional request to the iframe URL, passing the current page URL as the Referer header.

Implementation:

import re

from bs4 import BeautifulSoup
import requests

url = 'http://www.growingagreenerworld.com/episode125/'
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.115 Safari/537.36'}

with requests.Session() as session:
    # add the browser-like User-Agent to the session defaults instead of replacing them
    session.headers.update(headers)

    # fetch and parse the outer page; name the parser explicitly
    response = session.get(url)
    soup = BeautifulSoup(response.content, 'html.parser')

    # follow the iframe URL, sending the page URL as the Referer
    # (the 'http:' prefix assumes the src is protocol-relative, e.g. '//host/path')
    response = session.get('http:' + soup.iframe['src'], headers={'Referer': url})
    soup = BeautifulSoup(response.content, 'html.parser')

    # extract the video URL from the first script tag
    print(re.search(r'"url":"(.*?)"', soup.script.text).group(1))

Prints:

http://pdl.vimeocdn.com/43109/378/290982236.mp4?token2=1424891659_69f846779e96814be83194ac3fc8fbae&aksessionid=678424d1f375137f
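
Two steps in the code above are fragile: prepending 'http:' only works when the iframe src is protocol-relative, and calling .group(1) directly raises an AttributeError when the regex finds nothing. The following is a minimal sketch of more forgiving helpers, using only the standard library. The names resolve_iframe_url and extract_video_url are hypothetical, and the unescaping of "\/" assumes the script embeds JSON-style escaped slashes, which may not apply to every page.

```python
import re

try:
    from urllib.parse import urljoin  # Python 3
except ImportError:
    from urlparse import urljoin  # Python 2


def resolve_iframe_url(page_url, src):
    """Turn a relative or protocol-relative iframe src into an absolute URL."""
    # urljoin borrows the scheme (and host, if needed) from page_url
    return urljoin(page_url, src)


def extract_video_url(script_text):
    """Return the first "url":"..." value found in script_text, or None."""
    match = re.search(r'"url":"(.*?)"', script_text)
    if match is None:
        return None
    # Assumption: slashes may be escaped as \/ inside embedded JSON; undo that
    return match.group(1).replace('\\/', '/')
```

With these, the request in the answer could become session.get(resolve_iframe_url(url, soup.iframe['src']), headers={'Referer': url}), and the last line could check the result of extract_video_url(soup.script.text) for None before printing it.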

1 Comment

I'm getting an error: response = session.get('http:' + soup.iframe['src'], headers={'Referer': url}) TypeError: 'NoneType' object has no attribute '__getitem__'
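
That TypeError means soup.iframe was None, that is, the parsed HTML contained no iframe element at the time of the request (possibly because the page markup changed or the response was not the expected page). A minimal sketch of a guard, assuming soup is the BeautifulSoup object from the answer; iframe_src is a hypothetical helper name:

```python
def iframe_src(soup):
    """Return the src of the first iframe in soup, or None if there is none."""
    iframe = soup.find('iframe')  # same lookup as soup.iframe
    if iframe is None:
        return None
    # Tag.get returns None when the attribute is missing
    return iframe.get('src')
```

Checking the return value for None before building the second request turns the crash into a condition you can report or handle, for example by printing response.status_code and inspecting the downloaded HTML.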
