1

I know there are tons of questions on here about validating a web address with something like this:

/^[a-zA-Z]+[:\/\/]+[A-Za-z0-9\-_]+\\.+[A-Za-z0-9\.\/%&=\?\-_]+$/i

The only problem is that not everybody types the http:// (or whatever scheme comes first). I wanted to keep using preg_match(), but treat the http:// part as optional rather than required. I modified the pattern to this, but now it rejects the URL if it does have http:// in it:

/^[A-Za-z0-9\-_]+\\.+[A-Za-z0-9\.\/%&=\?\-_]+$/i

I was hoping to validate it on these conditions instead:

  • If it has http:// or www, just ignore that part
  • If the .extension is longer than 9 characters, reject it
  • If it contains no full stops, reject it

Anybody got an idea? Thanks :)

4 Answers

2

Can't you just use the built-in filter_var function?

filter_var('example.com', FILTER_VALIDATE_URL);

Not sure about the nine chars extension limit, but I guess you could easily check this in an additional step.
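A minimal sketch of that two-step idea, in case it helps. The function name isAcceptableUrl and the choice to treat "the extension" as the text after the last dot of the host are assumptions for illustration, not something from the question. Note that FILTER_VALIDATE_URL expects a scheme, so a bare 'example.com' comes back as false; the input here is assumed to already carry one.

```php
<?php
// Hypothetical helper: validate with filter_var(), then check the extension length.
function isAcceptableUrl(string $url, int $maxExtensionLength = 9): bool
{
    // FILTER_VALIDATE_URL returns false when the scheme is missing.
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }

    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || strpos($host, '.') === false) {
        return false; // no host, or no full stop in the host
    }

    // Assumption: the "extension" is the text after the last dot of the host.
    $extension = substr($host, strrpos($host, '.') + 1);

    return strlen($extension) <= $maxExtensionLength;
}

var_dump(isAcceptableUrl('http://example.com'));         // bool(true)
var_dump(isAcceptableUrl('example.com'));                // bool(false)
var_dump(isAcceptableUrl('http://example.abcdefghijk')); // bool(false)
```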


2 Comments

filter_var can be set up to require the scheme (the http:// part) or not; if you don't require it, the URL validates whether or not the scheme is there. People love to reinvent the wheel. And they usually make it square.
@Erik: PHP seems to have removed the FILTER_FLAG_SCHEME_REQUIRED constant; at least it's no longer in the manual...
0

Why not add a stage before the regexp that simply removes the http:// if present? The same would apply to the www. prefix. That may make your life a bit easier.
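Something along these lines could serve as that pre-processing stage. The name stripUrlPrefix is made up for this sketch, and also accepting https:// is an assumption beyond what the question asks for.

```php
<?php
// Hypothetical helper: drop a leading "http://" or "https://", then a leading "www.".
function stripUrlPrefix(string $url): string
{
    // '#' is used as the delimiter so the slashes need no escaping.
    return preg_replace('#^(?:https?://)?(?:www\.)?#i', '', trim($url));
}

echo stripUrlPrefix('http://www.example.com/page'), "\n"; // example.com/page
echo stripUrlPrefix('www.example.com'), "\n";             // example.com
echo stripUrlPrefix('example.com'), "\n";                 // example.com
```

The result can then be passed to the schemeless pattern from the question.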


0
/^(http:\/\/|www\.)/

/^.+?\.\S{0,9}\./

/\./

Those should work for your bullet points?
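For what it's worth, here is one way the three bullet points might be combined into a single check. This is only a sketch: the function name passesBasicChecks is hypothetical, the patterns are my own rather than the ones above, and "extension" is assumed to mean the text after the last dot of the host part.

```php
<?php
// Hypothetical helper combining the three conditions from the question.
function passesBasicChecks(string $url): bool
{
    // Bullet 1: ignore an optional scheme and "www." prefix.
    $rest = preg_replace('#^(?:https?://)?(?:www\.)?#i', '', $url);

    // Only the host matters below, so cut at the first port, path, query or fragment.
    $host = preg_split('~[/?#:]~', $rest, 2)[0];

    // Bullet 3: reject when there is no full stop at all.
    if (strpos($host, '.') === false) {
        return false;
    }

    // Bullet 2: the text after the last dot must be 1 to 9 characters long.
    return preg_match('~\.[^.]{1,9}$~', $host) === 1;
}

var_dump(passesBasicChecks('http://www.example.com'));        // bool(true)
var_dump(passesBasicChecks('www.example.com/path.html?x=1')); // bool(true)
var_dump(passesBasicChecks('example'));                       // bool(false)
var_dump(passesBasicChecks('example.abcdefghij'));            // bool(false)
```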


-1

not everybody uses the http://

They should. Without a scheme it simply isn't a URL, and omitting it can cause weird problems. For example:

www.example.com:8080/file.txt

This is a valid URL with the non-existent scheme www.example.com:.

If you are sure that the normal scheme should be http:, you could try automatically prepending http:// to ‘fix up’ any URL that doesn't begin with https?:, before validation. But you shouldn't allow, keep, or return schemeless URLs over the longer term.
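A rough sketch of that fix-up step, assuming http is the right default for your inputs. The function name ensureHttpScheme is made up for illustration.

```php
<?php
// Hypothetical helper: prepend "http://" when the input has no http(s) scheme.
function ensureHttpScheme(string $url): string
{
    $url = trim($url);

    // Leave the input alone if it already begins with "http:" or "https:".
    if (preg_match('#^https?:#i', $url) === 1) {
        return $url;
    }

    return 'http://' . $url;
}

echo ensureHttpScheme('www.example.com:8080/file.txt'), "\n"; // http://www.example.com:8080/file.txt
echo ensureHttpScheme('https://example.com'), "\n";           // https://example.com
```

Store and return the fixed-up value rather than the original input, so nothing schemeless survives past this point.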

Incidentally, the regex you are currently using is a long way from accurate according to the official URI syntax (see RFC 3986). It will disallow many valid URI characters, not to mention Unicode characters in IRIs. If you want proper validation you should use a real URL parser; if you just want a quick check for obvious problems you should use something much more permissive, for example just checking for the absence of categorically invalid characters like space and ".
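The permissive check could look something like this. The function name and the exact character set (whitespace, control characters, double quote, angle brackets) are assumptions for the sketch; adjust the set to taste.

```php
<?php
// Hypothetical helper: reject only characters that should never appear raw in a URL.
function hasNoObviouslyInvalidChars(string $url): bool
{
    // Whitespace, ASCII control characters, double quote and angle brackets.
    // Non-ASCII bytes are deliberately allowed so IRIs pass through.
    return $url !== '' && preg_match('/[\s\x00-\x1F\x7F"<>]/', $url) === 0;
}

var_dump(hasNoObviouslyInvalidChars('http://example.com/file.txt')); // bool(true)
var_dump(hasNoObviouslyInvalidChars('http://example.com/a b'));      // bool(false)
var_dump(hasNoObviouslyInvalidChars('http://example.com/"x"'));      // bool(false)
```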

