Does someone have a regex for validating URLs (NOT for finding them inside a text passage)? A JavaScript snippet would be preferred.
19 Answers
In the accepted answer bobince got it right: validating only the scheme name, ://, and spaces and double quotes is usually enough. Here is how the validation can be implemented in JavaScript:
var url = 'http://www.google.com';
var valid = /^(ftp|http|https):\/\/[^ "]+$/.test(url);
// true
or
var r = /^(ftp|http|https):\/\/[^ "]+$/;
r.test('http://www.goo le.com');
// false
or
var url = 'http:www.google.com';
var r = new RegExp(/^(ftp|http|https):\/\/[^ "]+$/);
r.test(url);
// false
7 Comments
The actual URL syntax is pretty complicated and not easy to represent in regex. Most of the simple-looking regexes out there will give many false negatives as well as false positives. See for amusement these efforts but even the end result is not good.
Plus these days you would generally want to allow IRI as well as old-school URI, so we can link to valid addresses like:
http://en.wikipedia.org/wiki/Þ
http://例え.テスト/
I would go only for simple checks: does it start with a known-good method: name? Is it free of spaces and double-quotes? If so then hell, it's probably good enough.
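A minimal sketch of that simple check, assuming only ftp, http and https count as known-good schemes (the function name looksLikeUrl is made up for illustration). Because it only excludes spaces and double quotes, IRIs such as the two above pass unchanged:

```javascript
// Hypothetical helper: accepts a string that starts with an allowed scheme
// followed by "://" and contains no spaces or double quotes.
function looksLikeUrl(value) {
  return /^(?:ftp|https?):\/\/[^ "]+$/.test(value);
}

console.log(looksLikeUrl('http://en.wikipedia.org/wiki/Þ')); // true
console.log(looksLikeUrl('http://例え.テスト/'));              // true
console.log(looksLikeUrl('http://www.goo le.com'));          // false (contains a space)
console.log(looksLikeUrl('www.google.com'));                 // false (no scheme)
```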
4 Comments
Try this regex
/(ftp|http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?/
It works best for me.
4 Comments
I've found some success with this:
/^((ftp|http|https):\/\/)?www\.([A-Za-z]+)\.([A-Za-z]{2,})/
- It checks one or none of the following: ftp://, http://, or https://
- It requires www.
- It checks for any number of valid characters.
- Finally, it checks that it has a domain and that domain is at least 2 characters.
It's obviously not perfect but it handled my cases pretty well
1 Comment
This regex is a patched version of @Aamir's answer that worked for me:
/((?:(?:http?|ftp)[s]*:\/\/)?[a-z0-9-%\/\&=?\.]+\.[a-z]{2,4}\/?([^\s<>\#%"\,\{\}\\|\\\^\[\]`]+)?)/gi
It matches these URL formats
- yourwebsite.com
- yourwebsite.com/4564564/546564/546564?=adsfasd
- www.yourwebsite.com
- http://yourwebsite.com
- https://yourwebsite.com
- ftp://www.yourwebsite.com
- ftp://yourwebsite.com
- http://yourwebsite.com/4564564/546564/546564?=adsfasd
Comments
You can simply use type="url" in your input and then check it with checkValidity() in JS.
E.g:
your.html
<input id="foo" type="url">
your.js
$("#foo").on("keyup", function() {
  if (this.checkValidity()) {
    // The url is valid
  } else {
    // The url is invalid
  }
});
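The same idea without jQuery might look like the sketch below. attachUrlCheck and onResult are hypothetical names, not part of any library; the only browser APIs used are addEventListener and checkValidity(). Note that an empty field counts as valid unless the input also has the required attribute.

```javascript
// Hypothetical helper: reports the validity of an <input type="url">
// every time its value changes.
function attachUrlCheck(input, onResult) {
  input.addEventListener('input', function () {
    // checkValidity() applies the browser's built-in URL constraint.
    onResult(input.checkValidity());
  });
}

// Usage in a page that contains <input id="foo" type="url">:
// attachUrlCheck(document.getElementById('foo'), function (ok) {
//   console.log(ok ? 'The url is valid' : 'The url is invalid');
// });
```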
1 Comment
<html>
<head>
<title>URL</title>
<script type="text/javascript">
function validate() {
var url = document.getElementById("url").value;
var pattern = /(ftp|http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?/;
if (pattern.test(url)) {
alert("Url is valid");
return true;
}
alert("Url is not valid!");
return false;
}
</script>
</head>
<body>
URL :
<input type="text" name="url" id="url" />
<input type="submit" value="Check" onclick="validate();" />
</body>
</html>
2 Comments
I couldn't find one that worked well for my needs, so I wrote my own and posted it at https://gist.github.com/geoffreyrobichaux/0a7774b424703b6c0fffad309ab0ad0a
function validURL(s) {
var regexp = /^(ftp|http|https|chrome|:\/\/|\.|@){2,}(localhost|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}|\S*:\w*@)*([a-zA-Z]|(\d{1,3}|\.){7}){1,}(\w|\.{2,}|\.[a-zA-Z]{2,3}|\/|\?|&|:\d|@|=|\/|\(.*\)|#|-|%)*$/gum
return regexp.test(s);
}
Comments
Try this regex, it works for me:
function isUrl(s) {
var regexp = /(ftp|http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?/
return regexp.test(s);
}
3 Comments
The regex above has \S+ in the middle of the expression, which can expand to match nearly anything, and it's not anchored at the end, so you can put any trailing nonsense in. E.g. ‘http://@’ or ‘I've got a lovely bunch of "coconuts"’ are ‘valid’.
Try this instead:
var urlRegExp = /^(?:(?:https?|ftp):\/\/)(?:\S+(?::\S*)?@)?(?:(?!10(?:\.\d{1,3}){3})(?!127(?:\.\d{1,3}){3})(?!169\.254(?:\.\d{1,3}){2})(?!192\.168(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,})))(?::\d{2,5})?(?:\/[^\s]*)?$/i;
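To see why the missing anchors matter, the sketch below runs the unanchored pattern from the earlier answers next to the simple anchored check from the top of the thread. The sample strings are made up for illustration.

```javascript
// The unanchored pattern from the earlier answers.
const loose = /(ftp|http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?/;
// The simple anchored check: scheme, "://", then no spaces or double quotes.
const anchored = /^(ftp|http|https):\/\/[^ "]+$/;

const samples = [
  'http://example.com',                      // accepted by both
  'junk before http://example.com',          // only the loose pattern accepts it
  'http://example.com and "trailing" junk',  // only the loose pattern accepts it
];

for (const s of samples) {
  console.log(JSON.stringify(s), 'loose:', loose.test(s), 'anchored:', anchored.test(s));
}
```

Neither pattern rejects 'http://@'; anchoring only stops surrounding text from slipping through.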
Comments
/(?:http[s]?\/\/)?(?:[\w\-]+(?::[\w\-]+)?@)?(?:[\w\-]+\.)+(?:[a-z]{2,4})(?::[0-9]+)?(?:\/[\w\-\.%]+)*(?:\?(?:[\w\-\.%]+=[\w\-\.%!]+&?)+)?(#\w+\-\.%!)?/
1 Comment
The regex is missing the : between the protocol and //. Otherwise, it works fine!
I use the /^[a-z]+:[^:]+$/i regular expression for URL validation. See an example of my cross-browser InputKeyFilter code with URL validation.
<!doctype html>
<html xmlns="http://www.w3.org/1999/xhtml" >
<head>
<title>Input Key Filter Test</title>
<meta name="author" content="Andrej Hristoliubov">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
<!-- For compatibility of IE browser with audio element in the beep() function.
https://www.modern.ie/en-us/performance/how-to-use-x-ua-compatible -->
<meta http-equiv="X-UA-Compatible" content="IE=9"/>
<link rel="stylesheet" href="https://rawgit.com/anhr/InputKeyFilter/master/InputKeyFilter.css" type="text/css">
<script type="text/javascript" src="https://rawgit.com/anhr/InputKeyFilter/master/Common.js"></script>
<script type="text/javascript" src="https://rawgit.com/anhr/InputKeyFilter/master/InputKeyFilter.js"></script>
</head>
<body>
URL:
<input type="url" id="Url" value=":"/>
<script>
CreateUrlFilter("Url", function(event){//onChange event
inputKeyFilter.RemoveMyTooltip();
var elementNewInteger = document.getElementById("NewUrl");
elementNewInteger.innerHTML = this.value;
}
//onblur event. Use this function if you want set focus to the input element again if input value is NaN. (empty or invalid)
, function(event){ this.ikf.customFilter(this); }
);
</script>
New URL: <span id="NewUrl"></span>
</body>
</html>
Also see my page example of the input key filter.
2 Comments
I have tried a few but there were a few issues so I came up with this one.
/(https?:\/\/(?:www\d*\.|(?!www\d*\.))[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|www\d*\.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|https?:\/\/(?:www\d*\.|(?!www\d*\.))[a-zA-Z0-9]+\.[^\s]{2,}|www\d*\.[a-zA-Z0-9]+\.[^\s]{2,})/gi;
How to use
const isValidUrl = (url = '') => {
  if (url) {
    var expression =
      /(https?:\/\/(?:www\d*\.|(?!www\d*\.))[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|www\d*\.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|https?:\/\/(?:www\d*\.|(?!www\d*\.))[a-zA-Z0-9]+\.[^\s]{2,}|www\d*\.[a-zA-Z0-9]+\.[^\s]{2,})/gi;
    var regex = new RegExp(expression);
    return !!url.match(regex);
  }
  return false;
};
Breakdown
/(
https?:\/\/ # matches http:// or https://
(?:www\d*\.|(?!www\d*\.) # matches an optional "www" prefix with zero or more digits, followed by a dot,
# or excludes "www" prefix followed by digits
)[a-zA-Z0-9][a-zA-Z0-9-]+ # matches the domain name
[a-zA-Z0-9]\. # matches the dot before the top-level domain
[^\s]{2,} # matches the rest of the URL after the domain name
| # or
www\d*\.[a-zA-Z0-9][a-zA-Z0-9-]+ # matches the "www" prefix with zero or more digits, followed by a dot, and the domain name
[a-zA-Z0-9]\. # matches the dot before the top-level domain
[^\s]{2,} # matches the rest of the URL after the domain name
| # or
https?:\/\/ # matches http:// or https://
(?:www\d*\.|(?!www\d*\.) # matches an optional "www" prefix with zero or more digits, followed by a dot,
# or excludes "www" prefix followed by digits
)[a-zA-Z0-9]+\.[^\s]{2,} # matches the domain name and top-level domain
| # or
www\d*\.[a-zA-Z0-9]+\.[^\s]{2,} # matches the "www" prefix with zero or more digits, followed by a dot, and the domain name and top-level domain
)/gi;
Valid URLs
http://www.example.com
https://www.example.co.uk
http://www1.example.com
http://www2.example.com
http://www3.example.com
https://www1.example.co.uk
https://www2.example.co.uk
https://www3.example.co.uk
https://example.com
http://example.com
www.example.com
www1.example.com
www2.example.com
www3.example.com
www.example.co.uk
www1.example.co.uk
www2.example.co.uk
www3.example.co.uk
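Invalid URLs
example
example.com
ftp://example.com
http://www.example
http://www.example.
http://www.example/
Because the pattern is not anchored, ftp://www.example.com (through its www.example.com substring) and http://example./com also match, so they are not rejected.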
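One way to tighten the pattern, sketched under the assumption that the whole input must be a URL, is to wrap it in ^(?:...)$. The names loose and anchored are made up for this example, and the g flag is dropped so that test() keeps no state between calls.

```javascript
// Same pattern as above, without the g flag.
const loose = /(https?:\/\/(?:www\d*\.|(?!www\d*\.))[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|www\d*\.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|https?:\/\/(?:www\d*\.|(?!www\d*\.))[a-zA-Z0-9]+\.[^\s]{2,}|www\d*\.[a-zA-Z0-9]+\.[^\s]{2,})/i;

// Assumption: nothing may precede or follow the URL.
const anchored = new RegExp('^(?:' + loose.source + ')$', 'i');

console.log(loose.test('ftp://www.example.com'));        // true (matches the www.example.com substring)
console.log(anchored.test('ftp://www.example.com'));     // false
console.log(anchored.test('https://www.example.co.uk')); // true
console.log(anchored.test('www.example.com'));           // true
console.log(anchored.test('http://example./com'));       // true (anchoring alone does not catch this one)
```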
Comments
/^(http|ftp)s?:\/\/((?=.{3,253}$)(localhost|(([^ ]){1,63}\.[^ ]+)))$/
explanation:
- A URL can start with http or ftp; an s can follow, but not necessarily; :// is a must right after.
- The maximum length of the domain labels together with the TLD is 253. What we see here is a lookahead that checks that the total length is at least 3 (i.e. http://a.b) and at most 253.
- Then there's either localhost or domain-name.TLD. The domain name can be made out of multiple labels, divided by a dot (i.e. https://inner.sub.domain.net), and the maximum length of each label is 63. I didn't see anywhere that there's a limitation on the TLD length, so I didn't put any restriction there.
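A quick way to try the points above, as a sketch: the variable names and the generated host names are made up for the test, and the pattern is the one from this answer.

```javascript
const urlPattern = /^(http|ftp)s?:\/\/((?=.{3,253}$)(localhost|(([^ ]){1,63}\.[^ ]+)))$/;

// 63 three-letter labels: 63 * 3 + 62 dots = 251 characters (within the limit).
const fits = Array(63).fill('abc').join('.');
// 64 three-letter labels: 64 * 3 + 63 dots = 255 characters (over the limit).
const tooLong = Array(64).fill('abc').join('.');

console.log(urlPattern.test('http://a.b'));                   // true
console.log(urlPattern.test('https://inner.sub.domain.net')); // true
console.log(urlPattern.test('ftp://localhost'));              // true
console.log(urlPattern.test('http://nodot'));                 // false (no dot, not localhost)
console.log(urlPattern.test('http://' + fits));               // true
console.log(urlPattern.test('http://' + tooLong));            // false (lookahead rejects the length)
```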
What @bobince answered is a real concern.
The latest answers are very close (thanks @Akseli), but they all miss the obligatory dot in the URL and lengths.
The answer I provide above deals with those too.
Comments
From https://www.freecodecamp.org/news/how-to-validate-urls-in-javascript/
function isValidHttpUrl(str) {
  const pattern = new RegExp(
    '^(https?:\\/\\/)?' + // protocol
    '((([a-z\\d]([a-z\\d-]*[a-z\\d])*)\\.)+[a-z]{2,}|' + // domain name
    '((\\d{1,3}\\.){3}\\d{1,3}))' + // OR ip (v4) address
    '(\\:\\d+)?(\\/[-a-z\\d%_.~+]*)*' + // port and path
    '(\\?[;&a-z\\d%_.~+=-]*)?' + // query string
    '(\\#[-a-z\\d_]*)?$', // fragment locator
    'i'
  );
  return pattern.test(str);
}
console.log(isValidHttpUrl('https://www.freecodecamp.org/')); // true
console.log(isValidHttpUrl('mailto://freecodecamp.org')); // false
console.log(isValidHttpUrl('freeCodeCamp')); // false
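If a regex is not a hard requirement, a possible alternative is to let the built-in URL constructor parse the string and then check the scheme. This is only a sketch: isHttpUrl is a made-up name, and unlike the regex above it rejects input without a scheme, such as www.example.com.

```javascript
// Hypothetical alternative: rely on the built-in URL parser and
// accept only the http: and https: schemes.
function isHttpUrl(str) {
  try {
    const url = new URL(str);
    return url.protocol === 'http:' || url.protocol === 'https:';
  } catch (err) {
    // new URL() throws a TypeError when the input cannot be parsed.
    return false;
  }
}

console.log(isHttpUrl('https://www.freecodecamp.org/')); // true
console.log(isHttpUrl('mailto://freecodecamp.org'));     // false
console.log(isHttpUrl('freeCodeCamp'));                  // false
```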
Comments
The regex you provided is almost correct for matching URLs with optional valid protocols. However, it can be refined for better accuracy and readability. Here's an improved version:
^(https?:\/\/|ftp:\/\/)?(www\.)?([a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,6})(\/[a-zA-Z0-9#\/?=&%.\-]*)?$
Explanation:
- ^: Start of the string.
- (https?:\/\/|ftp:\/\/)?: Matches http://, https://, or ftp:// and makes it optional.
- (www\.)?: Matches www. and makes it optional.
- ([a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,6}): Matches the domain name.
  - [a-z0-9]+: Matches the initial part of the domain.
  - ([\-\.]{1}[a-z0-9]+)*: Matches subsequent parts of the domain that may include - or . followed by alphanumeric characters.
  - \.[a-z]{2,6}: Matches the top-level domain (e.g., .com, .in).
- (\/[a-zA-Z0-9#\/?=&%.\-]*)?: Matches the path and query string, including allowed characters, and makes it optional.
- $: End of the string.
Testing the Regex in JavaScript:
const regex = /^(https?:\/\/|ftp:\/\/)?(www\.)?([a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,6})(\/[a-zA-Z0-9#\/?=&%.\-]*)?$/;
const testUrls = [
"yourwebsite.com",
"yourwebsite.com/4564564/546564/546564?=adsfasd",
"www.yourwebsite.com",
"http://yourwebsite.com",
"https://yourwebsite.com",
"ftp://www.yourwebsite.com",
"ftp://yourwebsite.com",
"http://yourwebsite.com/4564564/546564/546564?=adsfasd",
"google.in",
"fb.co",
"live.com",
"test.live",
"http://test.google",
"lop://live.in"
];
testUrls.forEach(url => {
console.log(url.match(regex) ? `Matches: ${url}` : `Does not match: ${url}`);
});
This regex should produce the expected results:
Matches: yourwebsite.com
Matches: yourwebsite.com/4564564/546564/546564?=adsfasd
Matches: www.yourwebsite.com
Matches: http://yourwebsite.com
Matches: https://yourwebsite.com
Matches: ftp://www.yourwebsite.com
Matches: ftp://yourwebsite.com
Matches: http://yourwebsite.com/4564564/546564/546564?=adsfasd
Matches: google.in
Matches: fb.co
Matches: live.com
Matches: test.live
Matches: http://test.google
Does not match: lop://live.in
This approach ensures that URLs with valid protocols are matched and invalid protocols are not.