Web scraping video information from youtube using python

Question

I want to extract video information (such as the title and view count) for a given YouTube video using Python, the same way I have scraped other websites. But for some reason, my code either returns nothing or returns tags only for the recommended videos in the sidebar instead of the main video at the URL.

I tried the same code that I used for scraping other websites, shown below. Apparently it doesn't work on YouTube. What should I do to get video information from a YouTube URL?

import requests
from bs4 import BeautifulSoup

base_url = 'https://www.youtube.com/watch?'
search_string = 'v=I41aLSzLI50'
url = base_url + search_string

# Download the raw HTML and parse it (no JavaScript is executed here)
supers = requests.get(url).content
data = BeautifulSoup(supers, 'html.parser')

# Look for video links by CSS class and print each title
videos = data.find_all('a', class_='content-link spf-link yt-uix-sessionlink spf-link')
for video in videos:
    print(video.find('span', class_='title').get_text())

First, check whether the page uses JavaScript to add content; BeautifulSoup can't run JavaScript. You could also print the content returned by requests to see what you get. It may differ from what you see in a web browser: the server can send a CAPTCHA, a warning message, etc.
– furas, Commented Aug 25, 2019 at 21:12
Is there a reason why you don't use the YouTube API? developers.google.com/youtube/v3
– willeM_ Van Onsem, Commented Aug 25, 2019 at 21:36
No specific reason; I'm just a beginner who only knows BeautifulSoup. I guess the reason I couldn't see the HTML content of the main video is that the page uses JavaScript. Let me try youtube_dl and the YouTube API as you suggested. Big thanks!
– Sung Yeon Park, Commented Aug 25, 2019 at 22:49
But another question: why couldn't I see anything from my code just because the content is in JavaScript?
– Sung Yeon Park, Commented Aug 25, 2019 at 22:57
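
The last comment's question can be illustrated with a small offline sketch. An HTTP client receives only the text the server sends, and an HTML parser reads that text without executing any scripts in it, so an element that a script fills in later is still empty. The page below is made up for the illustration: the `pageData` variable and its fields are hypothetical and are not YouTube's actual markup. The sketch uses the standard library's `html.parser`, the same parser the question passes to BeautifulSoup, so it runs without extra packages.

```python
import json
import re
from html.parser import HTMLParser

# Made-up page: the heading is empty in the HTML the server sends, and a
# script would fill it in inside a browser. "pageData" is a hypothetical name.
RAW_HTML = """
<html><body>
<h1 id="title"></h1>
<script>
var pageData = {"title": "Example video", "views": 1234};
document.getElementById("title").textContent = pageData.title;
</script>
</body></html>
"""


class TextByTag(HTMLParser):
    """Collect the raw text found inside each tag, without running scripts."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tags
        self.text = {}    # tag name -> concatenated text content

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)
        self.text.setdefault(tag, "")

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()

    def handle_data(self, data):
        if self.stack:
            self.text[self.stack[-1]] += data


parser = TextByTag()
parser.feed(RAW_HTML)
parser.close()

heading_text = parser.text["h1"]      # what a static parser sees in <h1>
script_text = parser.text["script"]   # the script, as plain unexecuted text

# The data is present, but only as text inside the script. Find the
# assignment and decode the JSON object that starts right after it.
match = re.search(r"var pageData\s*=\s*", script_text)
page_data, _ = json.JSONDecoder().raw_decode(script_text, match.end())

print(repr(heading_text))
print(page_data["title"], page_data["views"])
```

The heading comes back empty because nothing ran the script, while the same information can still be recovered from the script's text when it happens to be stored as JSON.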

Erol B · Accepted Answer · 2019-08-26 00:24:21Z

I looked up a page on YouTube, and it seems that the information you are looking for is not in the original source (at least not where you are expecting it). There are scripts that create the content when your browser renders the page. Based on my experience, you have a few options.

1. Use one of the APIs the commenters suggested. I am not very familiar with these, but it might save you some time and effort. Web scraping can be problematic because page formats change, so scraping scripts may need to be updated.
2. If you insist on web scraping, you can use an automated browser. I used to use Selenium on a regular basis and it should work for your purposes. This will allow you to work with content generated by scripts.
3. I looked at the page source, and the information you are looking for appears to be contained within some <script> tags, but parsing this will be a pain.
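
As a rough sketch of the first option, the YouTube Data API v3 linked in the comments has a `videos` endpoint that returns a video's title and statistics as JSON. The function names below are my own, the API key is an assumption (it has to be created in a Google Cloud project), and the response fields are as I understand them from the API documentation, so treat this as a starting point rather than a definitive implementation. It uses `urllib` from the standard library to avoid extra dependencies; `requests.get(API_ENDPOINT, params=...)` should work just as well.

```python
import json
from urllib.parse import parse_qs, urlencode, urlparse
from urllib.request import urlopen

API_ENDPOINT = "https://www.googleapis.com/youtube/v3/videos"


def video_id_from_url(url):
    """Return the value of the `v` query parameter of a watch URL."""
    return parse_qs(urlparse(url).query)["v"][0]


def build_request_url(video_id, api_key):
    """Build a request URL asking for the snippet and statistics of one video."""
    query = urlencode(
        {"part": "snippet,statistics", "id": video_id, "key": api_key}
    )
    return f"{API_ENDPOINT}?{query}"


def parse_video_info(payload):
    """Pick the title and view count out of a decoded API response.

    Assumes the response contains at least one item and that the view
    count is reported as a numeric string.
    """
    item = payload["items"][0]
    return {
        "title": item["snippet"]["title"],
        "views": int(item["statistics"]["viewCount"]),
    }


def fetch_video_info(url, api_key):
    """Call the API over the network; this needs a valid API key."""
    request_url = build_request_url(video_id_from_url(url), api_key)
    with urlopen(request_url) as response:
        return parse_video_info(json.load(response))
```

With a real key, `fetch_video_info('https://www.youtube.com/watch?v=I41aLSzLI50', api_key)` would be the entry point. A robust version would also handle an empty `items` list (unknown video ID) and a missing `viewCount`.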
