I have this JavaScript regular expression that checks whether a URI is valid (RFC 3986):

/^(https?):\/\/((?:[a-z0-9.-]|%[0-9A-F]{2}){3,})(?::(\d+))?((?:\/(?:[a-z0-9-._~!$&'()*+,;=:@]|%[0-9A-F]{2})*)*)(?:\?((?:[a-z0-9-._~!$&'()*+,;=:\/?@]|%[0-9A-F]{2})*))?(?:#((?:[a-z0-9-._~!$&'()*+,;=:\/?@]|%[0-9A-F]{2})*))?$/i

Now I need to convert it for use in a MySQL query, using REGEXP.

E.g.:

SELECT *
FROM table_name t
WHERE t.uri REGEXP '....'

Could you help me?

  • What is your MySQL version? Commented Nov 28, 2019 at 10:13
  • Possible duplicate stackoverflow.com/questions/9033018/… Commented Nov 28, 2019 at 10:13
  • @WiktorStribiżew MySql version is 5.7.24-enterprise-commercial-advanced Commented Nov 28, 2019 at 11:40
  • @dbramwell that solution is more generic; I need to comply with RFC 3986. Commented Nov 28, 2019 at 11:41
  • 1) Double all ' chars inside '...' string literals. 2) Replace all (?: with (. 3) Certainly remove the first / and last /i and all \/ with /, 4) Replace \d with [0-9]. 5) Most likely, replace \? with \\? Commented Nov 28, 2019 at 11:46

1 Answer

You need to

  • Double every ' character inside the '...' string literal.
  • Replace every (?: with (, because legacy MySQL versions (including 5.7) use a POSIX-compliant regex engine that does not support non-capturing groups.
  • Remove the leading / and the trailing /i: in MySQL the pattern is passed as a string, not as a regex literal, and matching is case insensitive by default, so no i flag is needed (or list A-Z explicitly in case some global settings are overridden).
  • Replace every \/ with / to keep the regex clean.
  • Replace \d with [0-9]: POSIX does not know shorthand character classes, although a POSIX character class such as [[:digit:]] also matches any digit.
  • Replace \? with \\? (the backslash is itself an escape character inside a MySQL string literal), or just use [?], to match a literal ? symbol.
  • Always put a literal hyphen at the start or end of a character class (a bracket expression in POSIX terms).
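
The purely mechanical steps in this list can be sketched in JavaScript. The helper name toMysqlPattern is hypothetical, and this is only a rough sketch: it skips hyphen placement and explicit A-Z ranges, which still have to be done by hand, and it assumes \d never appears inside a character class.

```javascript
// Hypothetical helper (not part of any library): applies the mechanical
// rewrite steps to a JavaScript RegExp and returns the text to place
// between the single quotes of a MySQL string literal.
function toMysqlPattern(jsRegex) {
  return jsRegex.source          // .source drops the surrounding / and the flags
    .replace(/\(\?:/g, "(")      // (?: becomes ( since non-capturing groups are unsupported
    .replace(/\\\//g, "/")       // \/ becomes /
    .replace(/\\d/g, "[0-9]")    // \d becomes [0-9] (assumes \d is outside any [...])
    .replace(/\\\?/g, "[?]")     // \? becomes [?], avoiding backslashes in the SQL string
    .replace(/'/g, "''");        // double ' for the SQL string literal
}

// Small made-up pattern that exercises every step.
console.log(toMysqlPattern(/^a(?:\d+)\/b\?'$/));
```

The result still needs a manual pass to move any mid-class hyphen to the start or end of its bracket expression.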

Use

WHERE t.uri REGEXP '^https?://(([A-Za-z0-9.-]|%[0-9A-Fa-f]{2}){3,})(:[0-9]+)?((/([A-Za-z0-9._~!$&''()*+,;=:@-]|%[0-9A-Fa-f]{2})*)*)([?](([A-Za-z0-9._~!$&''()*+,;=:/?@-]|%[0-9a-fA-F]{2})*))?(#(([A-Za-z0-9._~!$&''()*+,;=:/?@-]|%[0-9A-Fa-f]{2})*))?$'
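
Since the converted pattern only uses syntax that JavaScript also accepts, a rough sanity check is to run both patterns over a few URIs and confirm that they agree. This is a sketch under assumptions: the sample URIs are made up, the i flag stands in for MySQL's default case-insensitive matching, and agreement in JavaScript does not prove identical behaviour in MySQL's own engine.

```javascript
// Original JavaScript pattern from the question.
const original = /^(https?):\/\/((?:[a-z0-9.-]|%[0-9A-F]{2}){3,})(?::(\d+))?((?:\/(?:[a-z0-9-._~!$&'()*+,;=:@]|%[0-9A-F]{2})*)*)(?:\?((?:[a-z0-9-._~!$&'()*+,;=:\/?@]|%[0-9A-F]{2})*))?(?:#((?:[a-z0-9-._~!$&'()*+,;=:\/?@]|%[0-9A-F]{2})*))?$/i;

// Converted pattern from the answer, with the SQL-doubled '' written as a single '.
// The "i" flag is an assumption that mimics MySQL's default case-insensitive REGEXP.
const converted = new RegExp(
  "^https?://(([A-Za-z0-9.-]|%[0-9A-Fa-f]{2}){3,})(:[0-9]+)?" +
  "((/([A-Za-z0-9._~!$&'()*+,;=:@-]|%[0-9A-Fa-f]{2})*)*)" +
  "([?](([A-Za-z0-9._~!$&'()*+,;=:/?@-]|%[0-9a-fA-F]{2})*))?" +
  "(#(([A-Za-z0-9._~!$&'()*+,;=:/?@-]|%[0-9A-Fa-f]{2})*))?$",
  "i"
);

// Made-up sample URIs, both valid and invalid.
const samples = [
  "http://example.com",
  "https://example.com:8080/a/b%20c?x=1&y=2#top",
  "HTTP://EXAMPLE.COM/%7e",
  "ftp://example.com",       // wrong scheme
  "http://ab",               // host shorter than three characters
  "http://example.com/a b",  // unescaped space
];

// Collect every sample on which the two patterns disagree.
const disagreements = samples.filter((s) => original.test(s) !== converted.test(s));
console.log(disagreements.length === 0 ? "patterns agree" : disagreements);
```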