65

I am using the following regex to validate YouTube video share URLs.

var valid = /^(http\:\/\/)?(youtube\.com|youtu\.be)+$/;
alert(valid.test(url));
return false;

I want the regex to support the following URL formats:

http://youtu.be/cCnrX1w5luM  
http://youtube/cCnrX1w5luM  
www.youtube.com/cCnrX1w5luM  
youtube/cCnrX1w5luM  
youtu.be/cCnrX1w5luM   

I tried different regexes, but I have not found one that works for share links. Can anyone help me solve this?

12 Answers

119

Here's a regex I use to match and capture the important bits of YouTube URLs with video codes:

^((?:https?:)?\/\/)?((?:www|m)\.)?((?:youtube(?:-nocookie)?\.com|youtu.be))(\/(?:[\w\-]+\?v=|embed\/|live\/|v\/)?)([\w\-]+)(\S+)?$

It works with the following URLs:

https://www.youtube.com/watch?v=DFYRQ_zQ-gk&feature=featured
https://www.youtube.com/watch?v=DFYRQ_zQ-gk
http://www.youtube.com/watch?v=DFYRQ_zQ-gk
//www.youtube.com/watch?v=DFYRQ_zQ-gk
www.youtube.com/watch?v=DFYRQ_zQ-gk
https://youtube.com/watch?v=DFYRQ_zQ-gk
http://youtube.com/watch?v=DFYRQ_zQ-gk
//youtube.com/watch?v=DFYRQ_zQ-gk
youtube.com/watch?v=DFYRQ_zQ-gk

https://m.youtube.com/watch?v=DFYRQ_zQ-gk
http://m.youtube.com/watch?v=DFYRQ_zQ-gk
//m.youtube.com/watch?v=DFYRQ_zQ-gk
m.youtube.com/watch?v=DFYRQ_zQ-gk

https://www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
http://www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
//www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US

https://www.youtube.com/embed/DFYRQ_zQ-gk?autoplay=1
https://www.youtube.com/embed/DFYRQ_zQ-gk
http://www.youtube.com/embed/DFYRQ_zQ-gk
//www.youtube.com/embed/DFYRQ_zQ-gk
www.youtube.com/embed/DFYRQ_zQ-gk
https://youtube.com/embed/DFYRQ_zQ-gk
http://youtube.com/embed/DFYRQ_zQ-gk
//youtube.com/embed/DFYRQ_zQ-gk
youtube.com/embed/DFYRQ_zQ-gk

https://www.youtube-nocookie.com/embed/DFYRQ_zQ-gk?autoplay=1
https://www.youtube-nocookie.com/embed/DFYRQ_zQ-gk
http://www.youtube-nocookie.com/embed/DFYRQ_zQ-gk
//www.youtube-nocookie.com/embed/DFYRQ_zQ-gk
www.youtube-nocookie.com/embed/DFYRQ_zQ-gk
https://youtube-nocookie.com/embed/DFYRQ_zQ-gk
http://youtube-nocookie.com/embed/DFYRQ_zQ-gk
//youtube-nocookie.com/embed/DFYRQ_zQ-gk
youtube-nocookie.com/embed/DFYRQ_zQ-gk

https://youtu.be/DFYRQ_zQ-gk?t=120
https://youtu.be/DFYRQ_zQ-gk
http://youtu.be/DFYRQ_zQ-gk
//youtu.be/DFYRQ_zQ-gk
youtu.be/DFYRQ_zQ-gk

https://www.youtube.com/HamdiKickProduction?v=DFYRQ_zQ-gk

https://www.youtube.com/live/sMbxjePPmkw?feature=share

The captured groups are:

  1. protocol
  2. subdomain
  3. domain
  4. path
  5. video code
  6. query string

https://regex101.com/r/vHEc61/1
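As a rough illustration of how the six groups line up, here is a minimal JavaScript sketch that applies the regex above unchanged and names each group. The helper name parseYouTubeUrl and the property names it returns are made up for this example; they are not part of the answer.

```javascript
// The regex from this answer, copied as-is.
const YT_URL = /^((?:https?:)?\/\/)?((?:www|m)\.)?((?:youtube(?:-nocookie)?\.com|youtu.be))(\/(?:[\w\-]+\?v=|embed\/|live\/|v\/)?)([\w\-]+)(\S+)?$/;

// Hypothetical helper: returns the six groups by name, or null if there is no match.
function parseYouTubeUrl(url) {
  const m = YT_URL.exec(url);
  if (m === null) return null;
  // Optional groups that did not take part in the match are undefined.
  const [, protocol, subdomain, domain, path, videoCode, query] = m;
  return { protocol, subdomain, domain, path, videoCode, query };
}

console.log(parseYouTubeUrl("https://www.youtube.com/watch?v=DFYRQ_zQ-gk&feature=featured"));
console.log(parseYouTubeUrl("youtu.be/DFYRQ_zQ-gk"));
```

Because the protocol, subdomain, and query groups are optional, code that reads them should allow for undefined.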


4 Comments

youtube.com/foo_bar- and youtube.com/foo_bar and youtube.com/watch?v= are not valid YouTube video URLs, but this regex will match them.
It doesn't match a valid link like youtube.com/live/sMbxjePPmkw?feature=share . I have added |live\/ after |embed\/ part. Final regex version: ^((?:https?:)?\/\/)?((?:www|m)\.)?((?:youtube(-nocookie)?\.com|youtu.be))(\/(?:[\w\-]+\?v=|embed\/|live\/|v\/)?)([\w\-]+)(\S+)?$
This matches youtuxbe/- which is not a valid YouTube URL. Changing the regex to ^((?:https?:)?\/\/)?((?:www|m)\.)?((?:youtube(-nocookie)?\.com|youtu\.be))(\/(?:[\w\-]+\?v=|embed\/|live\/|v\/)?)([\w\-]+)(\S+)?$ fixes this for me.
YouTube video IDs are exactly 11 characters long; anything else is invalid and therefore should not be matched. So I changed the ending ...([\w\-]+)(\S+)?$ pattern to ...([\w\-]{11})((?:\?|\&)\S+)?$. This also fixes the issues pointed out by @JoeyMason. The other fixes from the comments are included as well: regex101.com/r/ciUbdv/1
59
  • You're missing www in your regex
  • The second \. should be optional if you want to match both youtu.be and youtube (but I didn't change this since just youtube isn't actually a valid domain - see note below)
  • + in your regex allows for one or more of (youtube\.com|youtu\.be), not one or more wild-cards.
    You need to use a . to indicate a wild-card, and + to indicate you want one or more of them.

Try:

^(https?\:\/\/)?(www\.youtube\.com|youtu\.be)\/.+$

Live demo.

If you want it to match URLs with or without the www., just make it optional:

^(https?\:\/\/)?((www\.)?youtube\.com|youtu\.be)\/.+$

Live demo.

Invalid alternatives:

If you want www.youtu.be/... to also match (at the time of writing, this doesn't appear to be a valid URL format), put the optional www. outside the brackets:

^(https?\:\/\/)?(www\.)?(youtube\.com|youtu\.be)\/.+$

youtube/cCnrX1w5luM (with or without http://) isn't a valid URL, but the question explicitly mentions that the regex should support that. To include this, replace youtu\.be with youtu\.?be in any regex above. Live demo.
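To make the difference concrete, here is a small JavaScript sketch that runs the question's original regex and the last regex above (with youtu\.be replaced by youtu\.?be, as just described) against the five formats listed in the question. The variable names are made up for this example.

```javascript
// The five formats listed in the question.
const urls = [
  "http://youtu.be/cCnrX1w5luM",
  "http://youtube/cCnrX1w5luM",
  "www.youtube.com/cCnrX1w5luM",
  "youtube/cCnrX1w5luM",
  "youtu.be/cCnrX1w5luM",
];

// Regex from the question: nothing is allowed after the domain, and www. is missing.
const original = /^(http\:\/\/)?(youtube\.com|youtu\.be)+$/;

// Last regex above, with youtu\.be replaced by youtu\.?be.
const relaxed = /^(https?\:\/\/)?(www\.)?(youtube\.com|youtu\.?be)\/.+$/;

const originalResults = urls.map((u) => original.test(u));
const relaxedResults = urls.map((u) => relaxed.test(u));

console.log(originalResults); // every entry is false
console.log(relaxedResults);  // every entry is true
```

Note that the relaxed pattern only checks that something follows the slash; it does not check that what follows looks like a video ID.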

2 Comments

I think the question mark in youtu\.?be is wrong: you always want the exact string youtu.be in the URL if the URL is indeed pointing to http(s)://youtu.be.
@TomášPospíšek Edited.
20

I know I'm about 2 years late to the party, but I needed to write something up anyway, and this seems to fit every test case I can throw at it. You should be able to reference the first capture group ($1) to get the ID. It matches http, https, www and non-www, youtube.com, youtu.be, and /watch? and /watch.php? on youtube.com (youtu.be does not use these), and it still matches when there are other variables in the URL string (?t= for time, ?list= for playlists, etc.).

(?:https?:\/\/)?(?:youtu\.be\/|(?:www\.|m\.)?youtube\.com\/(?:watch|v|embed)(?:\.php)?(?:\?.*v=|\/))([a-zA-Z0-9\_-]+)
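For anyone who wants to see the "first match" in use, here is a minimal JavaScript sketch that pulls the ID out of group 1. The helper name extractVideoId is made up for this example; the regex is the one above, unchanged.

```javascript
// The regex from this answer, copied as-is. It is not anchored,
// so it also finds a URL embedded in surrounding text.
const YT_ID = /(?:https?:\/\/)?(?:youtu\.be\/|(?:www\.|m\.)?youtube\.com\/(?:watch|v|embed)(?:\.php)?(?:\?.*v=|\/))([a-zA-Z0-9\_-]+)/;

// Hypothetical helper: returns the captured ID, or null if nothing matches.
function extractVideoId(url) {
  const m = url.match(YT_ID);
  return m ? m[1] : null;
}

console.log(extractVideoId("https://www.youtube.com/watch?v=DFYRQ_zQ-gk&feature=featured"));
console.log(extractVideoId("http://youtu.be/DFYRQ_zQ-gk?t=120"));
console.log(extractVideoId("https://example.com/"));
```

One thing to keep in mind: the .* before v= is greedy, so if a later query parameter also ends in v=, the captured value will come from that later parameter.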

3 Comments

Any chance you could update this to support youtube.com/watch/IDHERE, which is valid?
@JacobMorrison Another two years late, but what the hell - updated the code :)
^(?:https?:)?(?:\/\/)?(?:youtu\.be\/|(?:www\.|m\.)?youtube\.com\/(?:watch|v|embed)(?:\.php)?(?:\?.*v=|\/))([a-zA-Z0-9\_-]{7,15})(?:[\?&][a-zA-Z0-9\_-]+=[a-zA-Z0-9\_-]+)*$ I improved it a bit so that it checks that the input starts and ends with the URL, so that things like "extra text youtube.com/embed/DFYRQ_zQ-gk extra text" are not valid. I also added validation that the ID is not shorter than 7 characters.
12

Format for YouTube videos has changed. This regex works for all cases:

^(http(s)??\:\/\/)?(www\.)?((youtube\.com\/watch\?v=)|(youtu.be\/))([a-zA-Z0-9\-_])+

Tests here.

5 Comments

what has changed? phuc77's answer seems better.
Not all of these tests will pass using phuc77's answer: regex101.com/r/RyE7OM/2/tests. Specifically, youtube.com/foo_bar and youtube.com/watch?v= should not validate.
This answer should be used by anyone searching for a solution. It is the best I've found till now.
If you want to capture the ID, there's a typo in your regex: the + sign at the end should be before the last parenthesis, because otherwise it captures only the last character. The final regex should look like this: ^(http(s)??\:\/\/)?(www\.)?((youtube\.com\/watch\?v=)|(youtu.be\/))([a-zA-Z0-9\-_]+)
phuc77's answer seems better; this answer doesn't pass all the tests: regexr.com/4b2fh
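The capture issue raised in the comments is easy to reproduce. The following JavaScript sketch compares the regex as posted in the answer with the version proposed in the comment; the variable names are made up for this example.

```javascript
const url = "https://www.youtube.com/watch?v=DFYRQ_zQ-gk";

// As posted in the answer: the + sits outside the last group,
// so the group is re-captured on every repetition.
const posted = /^(http(s)??\:\/\/)?(www\.)?((youtube\.com\/watch\?v=)|(youtu.be\/))([a-zA-Z0-9\-_])+/;

// As proposed in the comment: the + sits inside the last group.
const corrected = /^(http(s)??\:\/\/)?(www\.)?((youtube\.com\/watch\?v=)|(youtu.be\/))([a-zA-Z0-9\-_]+)/;

// Group 7 is the last capture group in both patterns.
const lastCharOnly = posted.exec(url)[7];
const fullId = corrected.exec(url)[7];

console.log(lastCharOnly); // "k"
console.log(fullId);       // "DFYRQ_zQ-gk"
```

Both patterns accept the same strings; only the content of the last group differs.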
6

I took one of the answers from here and added support for a few edge cases that I noticed in my dataset. This should work for pretty much any valid url.

^(?:https?:)?(?:\/\/)?(?:youtu\.be\/|(?:www\.|m\.)?youtube\.com\/(?:watch|v|embed)(?:\.php)?(?:\?.*v=|\/))([a-zA-Z0-9\_-]{7,15})(?:[\?&][a-zA-Z0-9\_-]+=[a-zA-Z0-9\_-]+)*(?:[&\/\#].*)?$

2 Comments

This seems to catch all the cases plus it gets the video ID in a separate group
This works better than any other regex in this thread, Thank @zmanplex
5

Based on many of the other regexes here, this is the best I have come up with:

((http(s)?:\/\/)?)(www\.)?((youtube\.com\/)|(youtu.be\/))[\S]+

Test: http://regexr.com/3bga2


3

Try this:

((http://)?)(www\.)?((youtube\.com/)|(youtu\.be)|(youtube)).+

http://regexr.com?36o7a

1 Comment

There are a few unnecessary brackets there - ...(youtube\.com/|youtu.be|youtube).*, and you probably want to escape the . in youtu.be, and you may want to put the / outside (so it's included for youtu.be and youtube).
1

I tried this one and it works fine for me.

(?:http(?:s)?:\/\/)?(?:www\.)?(?:youtu\.be\/|youtube\.com\/(?:(?:watch)?\?(?:.*&)?v(?:i)?=|(?:embed|v|vi|user)\/))([^\?&\"'<> #]+)

You can check here https://regex101.com/r/Kvk0nB/1

1 Comment

Are you sure that this is working?
1

https://regexr.com/62kgd

^((http|https)\:\/\/)?(www\.youtube\.com|youtu\.?be)\/((watch\?v=)?([a-zA-Z0-9]{11}))(&.*)*$

https://www.youtube.com/watch?v=YPz9zqakRbk

https://www.youtube.com/watch?v=YPz9zqakRbk&t=11

http://youtu.be/cCnrX1w5luM&y=12

http://youtu.be/cCnrX1w5luM

http://youtube/cCnrXswsluM

www.youtube.com/cCnrX1w5luM

youtube/cCnrX1w5luM


0

Modified from phuk's answer, with these changes:

  • capturing only the token, using non-capturing groups for everything else
  • multi-line with comments, using the /x flag (written @x here; x is PCRE_EXTENDED)
  • using @ as the delimiter, so that / can be used without escaping
  • no escaping of - at the end of character lists,
    e.g. [\w-] not [\w\-]

Example at regex101 with an experimental inclusion of # Possible: oembed?url=...v=:

https://regex101.com/r/0pZCmF/1

$yttok_regex = <<<EOR
@^

# Possible: http://
#       https://
#       //
(?:(?:https?:)?//)?

# Possible: www.
#       m.
(?:(?:www|m)\.)?

# Possible: youtube.com
#       youtube-nocookie.com
#       youtu.be
(?:(?:youtube(?:-nocookie)?\.com|youtu.be))?

# Possible: /[a-zA-Z0-9_-]+?v=
#       /embed/
#       /v/
(?:/(?:[\w-]+\?v=|embed/|v/)?)?

# TOKEN:    [a-zA-Z0-9_-]
([\w-]+)

# Possible:
#       Anything not space+
(?:\S+)?

# EOF pattern with x(PCRE_EXTENDED) flag:
$@x
EOR;

Optionally use:

# TOKEN:    [a-zA-Z0-9_-]
([\w-]{11})

To match only 11-char long tokens.

1 Comment

(PS: as SO still refuses to honor 8-char wide tab, the lineup is not as nice as locally, - but hey. They also do what ever they can to mangle up the source, - which used to work. But they clearly hate tabs to eternity and back and would like all code to be non-indented. Likely negative indented just to make it even worse. But hey. Why use letters at all in code - perhaps we should start using emoticons instead. (The code would in many cases be more readable))
0

This is what I use in my scripts:

^(?:(?:https?:)?\/\/)?(?:(?:(?:www|m(?:usic)?)\.)?youtu(?:\.be|be\.com)\/(?:shorts\/|live\/|v\/|e(?:mbed)?\/|watch(?:\/|\?(?:\S+=\S+&)*v=)|oembed\?url=https?%3A\/\/(?:www|m(?:usic)?)\.youtube\.com\/watch\?(?:\S+=\S+&)*v%3D|attribution_link\?(?:\S+=\S+&)*u=(?:\/|%2F)watch(?:\?|%3F)v(?:=|%3D))?|www\.youtube-nocookie\.com\/embed\/)([\w-]{11})[\?&#]?\S*$

It matches:

    various protocols (and lack of one):
https://www.youtube.com/watch?v=U9t-slLl30E
http://www.youtube.com/watch?v=U9t-slLl30E
//www.youtube.com/watch?v=U9t-slLl30E
www.youtube.com/watch?v=U9t-slLl30E

    and for each protocol
    various domains:
www.youtube.com/watch?v=U9t-slLl30E
m.youtube.com/watch?v=U9t-slLl30E
music.youtube.com/watch?v=OD3F7J2PeYU
youtube.com/watch?v=U9t-slLl30E
www.youtube-nocookie.com/embed/U9t-slLl30E
youtu.be/U9t-slLl30E

    and for each domain (except -nocookie)
    various paths:
youtube.com/watch?v=U9t-slLl30E
youtube.com/watch/U9t-slLl30E
youtube.com/v/U9t-slLl30E
youtube.com/embed/U9t-slLl30E
youtube.com/e/U9t-slLl30E
youtube.com/live/9UMxZofMNbA
youtube.com/shorts/gOcxEMJSksg
youtube.com/oembed?url=http%3A//www.youtube.com/watch?v%3DU9t-slLl30E&format=json
youtube.com/attribution_link?a=JdfC0C9V6ZI&u=%2Fwatch%3Fv%3DU9t-slLl30E%26feature%3Dshare
youtube.com/attribution_link?a=8g8kPrPIi-ecwIsS&u=/watch%3Fv%3DU9t-slLl30E%26feature%3Dem-uploademail

    and for each path
    various parameters:
youtube.com/watch?v=U9t-slLl30E
youtube.com/watch?v=U9t-slLl30E&feature=shared
youtube.com/watch?v=U9t-slLl30E&t=1m02s
youtube.com/watch?v=U9t-slLl30E&lc=UgyYsn3aIQWSA19Esi54AaABAg
youtube.com/watch?v=Lo2qQmj0_h4&list=PLmXxqSJJq-yVWpRFGImHYZBQTuBGLjG4t&index=5&pp=iAQB8AUB
    in various orders:
youtube.com/watch?feature=shared&v=U9t-slLl30E

    but not these:
(wrong ID)
youtube.com/watch?v=U$t-slLl30E
(too short ID)
youtube.com/watch?v=U9t-slLl30&t=10
(wrong or deprecated paths)
youtube.com/GitHub?v=U9t-slLl30E
youtube.com/?v=U9t-slLl30E
youtube.com/?vi=U9t-slLl30E
youtube.com/?feature=player_embedded&v=U9t-slLl30E
youtube.com/watch?vi=U9t-slLl30E
youtube.com/vi/U9t-slLl30E
(www.youtube-nocookie.com/embed/ only!)
youtube-nocookie.com/embed/U9t-slLl30E
www.youtube-nocookie.com/watch?v=U9t-slLl30E
http://www.youtube-nocookie.com/v/U9t-slLl30E?version=3&hl=en_US&rel=0
(playlist)
youtube.com/playlist?list=PLmXxqSJJq-yVWpRFGImHYZBQTuBGLjG4t

Try it https://regex101.com/r/7upRfP/

It also captures the video ID. If you want, you can restrict the video ID further by using the pattern from Glenn's answer instead of ([\w-]{11}).

I'll try to keep this updated on gist https://gist.github.com/Kaligula0/1ff5f4e2cf1f351daeca3450f71fdcb5.
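As a usage sketch in JavaScript, the regex above can be wrapped in a small function that returns the captured ID. The helper name getVideoId is made up for this example, and the URLs are taken from the lists above.

```javascript
// The regex from this answer, copied as-is.
const YT = /^(?:(?:https?:)?\/\/)?(?:(?:(?:www|m(?:usic)?)\.)?youtu(?:\.be|be\.com)\/(?:shorts\/|live\/|v\/|e(?:mbed)?\/|watch(?:\/|\?(?:\S+=\S+&)*v=)|oembed\?url=https?%3A\/\/(?:www|m(?:usic)?)\.youtube\.com\/watch\?(?:\S+=\S+&)*v%3D|attribution_link\?(?:\S+=\S+&)*u=(?:\/|%2F)watch(?:\?|%3F)v(?:=|%3D))?|www\.youtube-nocookie\.com\/embed\/)([\w-]{11})[\?&#]?\S*$/;

// Hypothetical helper: returns the 11-character ID from group 1, or null.
function getVideoId(url) {
  const m = YT.exec(url);
  return m ? m[1] : null;
}

// URLs from the "It matches" list.
console.log(getVideoId("https://www.youtube.com/watch?v=U9t-slLl30E"));
console.log(getVideoId("youtube.com/shorts/gOcxEMJSksg"));
console.log(getVideoId("youtube.com/watch?feature=shared&v=U9t-slLl30E"));

// URLs from the "but not these" list.
console.log(getVideoId("youtube.com/watch?v=U9t-slLl30&t=10"));
console.log(getVideoId("youtube.com/playlist?list=PLmXxqSJJq-yVWpRFGImHYZBQTuBGLjG4t"));
```

The pattern is anchored at both ends, so the input should be trimmed to just the URL before calling the helper.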


-5

Check this pattern instead:

r'(?i)(http.//|https.//)*[A-Za-z0-9._%+-]+\.\w+'

