I need to get the HTML markup of a YouTube video page. Here is the code:

async function get_subtitles(video_id: string): Promise<string> {
    // Only the bare video id is expected here, not a full URL.
    if (video_id.includes('https://') || video_id.includes('http://')) {
        throw new Error("You provided an invalid video id. Make sure you are using the video id and NOT the url!")
    }

    const WATCH_LINK: string = `https://www.youtube.com/watch?v=${video_id}`

    // This cross-origin request is the one the browser blocks.
    const ytRES = await fetch(WATCH_LINK);
    console.log(ytRES)
    const ytHTML = await ytRES.text();

    //extract_captions_json(ytHTML, video_id);
    return ytHTML;
}

When I try to load the page, I get this error:

Cross-Origin error

The front end is written in React, in case that helps.

I tried custom headers, axios, and so on. Nothing works.

2 Answers

Why you get a CORS error with React:
Your React app is most probably set up for client-side rendering (the default for CRA).
In that case the page is served from one origin, e.g. http://yourhome.com, but its script then tries to fetch data from another one, https://youtube.com.
The browser's same-origin policy treats these as different origins (scheme, host, and port must all match), and since YouTube's response does not carry CORS headers allowing your origin, the browser blocks your code from reading it.
What you can do about it:

  • set up a proxy that maps your youtube.com requests to yourhome/youtube.com
    • with a `proxy` setting in your package.json in the project root (CRA's dev server reads it), see the related answer (it can proxy either directly to YouTube or to a custom server, for example an Express server)
    • with a reverse proxy server
  • use server-side rendering (probably; I didn't test it), with Next.js for example
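
The "custom server" variant of the proxy option could be sketched roughly as follows. This is only a minimal illustration using Python's standard library, not the setup described in the answer: the port 8000, the /watch/<video_id> route, the ALLOWED_ORIGIN value, and the helper names are all assumptions made for the example. The server fetches the watch page itself, where the browser's same-origin policy does not apply, and returns it with an Access-Control-Allow-Origin header so the React app is allowed to read it.

```python
import re
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

# Assumption: the React dev server is served from this origin.
ALLOWED_ORIGIN = "http://localhost:3000"
# Assumption: a video id contains only URL-safe characters (length is not checked).
VIDEO_ID_RE = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_video_id(video_id: str) -> bool:
    """Return True for a bare id, False for URLs or empty strings."""
    return VIDEO_ID_RE.fullmatch(video_id) is not None


def build_watch_url(video_id: str) -> str:
    """Build the watch-page URL, rejecting anything that is not a bare id."""
    if not is_valid_video_id(video_id):
        raise ValueError("expected a bare video id, not a URL")
    return f"https://www.youtube.com/watch?v={video_id}"


class WatchProxy(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        # Hypothetical route: GET /watch/<video_id>
        prefix = "/watch/"
        if not self.path.startswith(prefix):
            self._reply(404, b"not found")
            return
        try:
            url = build_watch_url(self.path[len(prefix):])
        except ValueError as exc:
            self._reply(400, str(exc).encode())
            return
        try:
            # Server-side fetch: no CORS check happens here.
            req = Request(url, headers={"User-Agent": "Mozilla/5.0"})
            with urlopen(req, timeout=10) as upstream:
                body = upstream.read()
        except OSError as exc:  # covers URLError, HTTPError and timeouts
            self._reply(502, str(exc).encode())
            return
        self._reply(200, body, "text/html; charset=utf-8")

    def _reply(self, status: int, body: bytes,
               content_type: str = "text/plain; charset=utf-8") -> None:
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        # This header is what lets the browser hand the response to the page.
        self.send_header("Access-Control-Allow-Origin", ALLOWED_ORIGIN)
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Run the proxy until interrupted."""
    HTTPServer(("127.0.0.1", port), WatchProxy).serve_forever()
```

After calling serve(), the React code would request http://localhost:8000/watch/<video_id> instead of the YouTube URL and read the HTML from that response.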

2 Comments

Maybe there is a workaround? I've seen the proxy approach before, but I thought it was overkill. I just need to get the HTML markup on the client and that's it. Maybe there is something like Python's requests?
Creating a proxy setting in your package.json is probably the simplest way to tackle this issue. Any other way I can think of would need either a server-side service running or prefetching the data into a file bundled with your React app (but that only works for predictable/common data).

You can use a separate server for this. For example, a Flask implementation:

from flask import Flask, jsonify
from youtube_transcript_api import YouTubeTranscriptApi
from youtube_transcript_api.formatters import TextFormatter

app = Flask(__name__)

@app.route("/api/subtitles/<video_id>")
def get_subtitles(video_id: str):
    if not isinstance(video_id, str):
        return jsonify({
            "status": "error",
            "reason": "Video ID is not a string value o_0"
        })
    
    try:
        transcription = YouTubeTranscriptApi.get_transcript(video_id, languages=("ru", "en"))
        transcription = TextFormatter().format_transcript(transcription)
    
    except Exception as e:
        print(e)
        return jsonify({
            "status": "error",
            "reason": str(e)
        })
    
    return jsonify({
        "status": "success",
        "data": transcription
    })
