I have tried several regex patterns (designed for use with PHP because I couldn't find any for MySQL) for URL validation, but none of them are working. Probably MySQL has a slightly different syntax.

I've also tried to write one myself, but without success.

So does anyone know a fairly good regex to use with MySQL for URL validation?

  • What do you want to validate? If the input string is a valid URL that can be resolved? Commented Jan 27, 2012 at 12:06
  • No, I want to SELECT all rows which don't match the URL pattern. (So all invalid URLs) Commented Jan 27, 2012 at 12:07
  • What kind of URLs are allowed? How strict validation are you thinking of? Commented Jan 27, 2012 at 12:17

2 Answers

According to section 11.5.2, "Regular Expressions", of the MySQL documentation, you can select rows with a regular expression using the following syntax:

SELECT field FROM table WHERE field REGEXP pattern
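
Since the question asks for the rows that do not match, the same operator can be negated with NOT REGEXP. Here is a minimal sketch of that idea, using Python's built-in sqlite3 module as a stand-in because it runs without a MySQL server. The `links` table, its `url` column, the sample rows, and the simplified prefix pattern are all made up for illustration, and SQLite needs a user-defined `regexp` function, which MySQL does not.

```python
import re
import sqlite3


def regexp(pattern, value):
    # SQLite evaluates "value REGEXP pattern" by calling regexp(pattern, value).
    if value is None:
        return 0
    return int(re.search(pattern, value) is not None)


conn = sqlite3.connect(":memory:")
conn.create_function("REGEXP", 2, regexp)

# Hypothetical table and rows, for illustration only.
conn.execute("CREATE TABLE links (url TEXT)")
conn.executemany(
    "INSERT INTO links (url) VALUES (?)",
    [("http://example.com",), ("not a url",), ("www.example.org",)],
)

# Simplified pattern: only checks for an http(s):// or www. prefix.
pattern = r"^(https?://|www\.)"

# NOT REGEXP selects the rows that fail the pattern.
invalid = [
    row[0]
    for row in conn.execute(
        "SELECT url FROM links WHERE url NOT REGEXP ? ORDER BY url", (pattern,)
    )
]
print(invalid)
```

In MySQL the query itself would have the same shape: SELECT field FROM table WHERE field NOT REGEXP pattern.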

To match simple URLs, you may use:

SELECT field FROM table
 WHERE field REGEXP "^(https?://|www\\.)[\.A-Za-z0-9\-]+\\.[a-zA-Z]{2,4}"
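
One way to sanity-check a pattern like this outside the database is to run it through Python's `re` module. This is only a sketch: Python's engine is not MySQL's, and the pattern below is written the way the regex engine should receive it, assuming MySQL's default string-literal handling (where `\\.` becomes `\.` and the `\.` and `\-` inside the brackets become a plain `.` and `-`). The sample inputs and the `looks_like_url` helper are made up for illustration.

```python
import re

# Pattern as the regex engine is assumed to see it after MySQL has
# processed the string literal (assumption: default backslash handling).
URL_PATTERN = re.compile(r"^(https?://|www\.)[.A-Za-z0-9-]+\.[a-zA-Z]{2,4}")


def looks_like_url(value):
    """Return True if value starts with something the pattern accepts."""
    return URL_PATTERN.search(value) is not None


# Hypothetical sample values.
samples = [
    "http://example.com",
    "https://sub.example.com/page",
    "www.example.org",
    "example.com",                # no scheme or www. prefix
    "www.foo.bar, with spaces!",  # trailing text is not checked
]

for s in samples:
    print(s, "->", looks_like_url(s))
```

The first three samples are accepted, `example.com` is rejected because the prefix is required, and the last one is accepted because the pattern has no `$` anchor at the end.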

This will match most URLs like

But not

4 Comments

That's amazing. Just one more question. Would it be possible to make the http:// or www optional? I want it to also match example.com or subdomain.example.com. Would that be easy?
Make it optional? Where? If you're talking about when inserting, this should all be done prior to inserting, so you wouldn't even need to match urls when selecting. Regarding your question, yes that would be easy to do, but it'd also mean stuff like backup.2012-01-27.xml would get matched.
Note that http://about.museum would not be matched as valid, but www.foo.bar, with spaces! would pass as valid.
Does not work with new domains. Does not work with foreign characters either. дизайнрекламы.рф, online.solutions etc.

Although the answer KBA posted works, there are some inconsistencies in the escaping.

The proper syntax should be as follows; it works in MySQL as well as in PHP, for example:

SELECT field FROM table
 WHERE field REGEXP "^(https?:\/\/|www\.)[\.A-Za-z0-9\-]+\.[a-zA-Z]{2,4}"

The above code will only match if the content of 'field' starts with a URL. If you would like to match all rows where the field contains a URL (for example, surrounded by other text or content), simply use:

SELECT field FROM table
 WHERE field REGEXP "(https?:\/\/|www\.)[\.A-Za-z0-9\-]+\.[a-zA-Z]{2,4}"
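
The effect of dropping the leading `^` can be sketched with Python's `re` module, again only as a stand-in for MySQL's engine. The sample text is made up, and the pattern is written with literal dots (`\.`), which is an assumption about what the regex engine should end up seeing.

```python
import re

BODY = r"(https?://|www\.)[.A-Za-z0-9-]+\.[a-zA-Z]{2,4}"

anchored = re.compile("^" + BODY)  # URL must be at the start of the field
unanchored = re.compile(BODY)      # URL may appear anywhere in the field

# Hypothetical field content with a URL in the middle.
text = "see http://example.com for details"

print(anchored.search(text))             # None: the text does not start with a URL
print(unanchored.search(text).group(0))  # http://example.com
```

For a field that starts with the URL, both forms behave the same; they only differ when other content comes first.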
