3

I'm creating a site where the user unfortunately has to provide a regex to be used in a MySQL WHERE clause. And of course I have to validate the user input to prevent SQL injection. The site is made in PHP, and I use the following regex to check my regex:

/^([^\\\\\']|\\\.)*$/

This is double-escaped because of PHP's way of handling regexes. The way it's supposed to work is to only match safe regexps, without unescaped single quotes. But being mostly self-taught, I'd like to know if this is a safe way of doing it.
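To make the intent of that pattern concrete, here is a minimal sketch of how it behaves on a few inputs. The function name `looksQuoteSafe` and the sample strings are made up for illustration. It only shows what the pattern accepts and rejects; it does not show that the result is safe to embed in a query.

```php
<?php
// The pattern from the question, as a single-quoted PHP string.
// After PHP's string unescaping, PCRE sees: /^([^\\']|\\.)*$/
// i.e. every character is either "not a backslash or quote",
// or a backslash followed by exactly one more character.
const SAFE_PATTERN = '/^([^\\\\\']|\\\.)*$/';

// Hypothetical helper name, not from the original post.
function looksQuoteSafe($input)
{
    return preg_match(SAFE_PATTERN, $input) === 1;
}

// Illustrative inputs only.
$samples = array(
    'abc[0-9]+',  // no quotes or backslashes
    '\d+',        // backslash followed by another character
    "it\\'s",     // the string it\'s : quote preceded by a backslash
    "it's",       // bare single quote
    'abc\\',      // the string abc\ : trailing backslash
);

foreach ($samples as $input) {
    echo str_pad($input, 12), looksQuoteSafe($input) ? 'accepted' : 'rejected', "\n";
}
```

The first three samples are accepted and the last two are rejected, which matches the stated goal of refusing unescaped single quotes and dangling backslashes.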

3 Answers

9

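If you pass the regex as a bound parameter in a prepared statement, it can't be used for SQL injection, because the value is never parsed as part of the SQL text. You should always use prepared statements for user input.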
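A minimal sketch of what that could look like with PDO. The table `items`, the column `name`, the function name `findByPattern` and the connection details are all assumptions for illustration, not taken from the question.

```php
<?php
/**
 * Sketch: bind a user-supplied regex as a parameter instead of
 * interpolating it into the SQL text.
 *
 * Assumptions: $pdo is an open PDO connection, and there is a
 * hypothetical table `items` with a text column `name`.
 */
function findByPattern(PDO $pdo, $userRegex)
{
    // The pattern travels as a bound value, so quotes or backslashes
    // inside it are never parsed as SQL.
    $stmt = $pdo->prepare('SELECT name FROM items WHERE name REGEXP :pattern');
    $stmt->execute(array(':pattern' => $userRegex));

    // Return the first column of every matching row.
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}

// Hypothetical usage; adjust the DSN and credentials for a real setup.
// $pdo = new PDO('mysql:host=localhost;dbname=test;charset=utf8mb4', 'user', 'secret');
// $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// $names = findByPattern($pdo, $_POST['pattern']);
```

With this approach the regex does not need to be escaped or validated for quoting purposes, although a malformed or expensive pattern can still cause an error or a slow query.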

Roborg makes an excellent point though about expensive regexes.


2 Comments

I also favor using prepared statement parameters, but they can be used only in place of literal values in a SQL statement. If you need to construct a query conditionally using application variables, you need to do interpolation. So SQL injection is still possible.
I'm not sure I follow what you mean about conditional query construction. I build dynamic queries all the time using a mix of known-safe application data and user-entered data, with SQL placeholders/parameters for the user-entered values, and I can't remember the last time I had to interpolate.
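As a rough illustration of the approach described in that last comment, the sketch below builds a WHERE clause conditionally from fixed, application-defined fragments and only ever binds the user-entered values. The table, the column names and the function name `buildSearchQuery` are hypothetical.

```php
<?php
/**
 * Sketch: conditional query construction without interpolating user data.
 * SQL fragments come from a fixed whitelist (application data);
 * user values are collected separately for binding.
 * The table `items` and its columns are hypothetical.
 */
function buildSearchQuery(array $filters)
{
    // Known-safe fragments, keyed by the filter names the app accepts.
    $allowed = array(
        'name'  => 'name REGEXP ?',
        'owner' => 'owner = ?',
    );

    $clauses = array();
    $params  = array();
    foreach ($filters as $field => $value) {
        if (!isset($allowed[$field])) {
            continue; // ignore anything that is not on the whitelist
        }
        $clauses[] = $allowed[$field];
        $params[]  = $value;
    }

    $sql = 'SELECT name FROM items';
    if ($clauses) {
        $sql .= ' WHERE ' . implode(' AND ', $clauses);
    }
    return array($sql, $params);
}

// The second key is not whitelisted, so it never reaches the SQL text.
list($sql, $params) = buildSearchQuery(array('name' => '^a.*', 'evil; DROP' => 'x'));
// $sql and $params can then be handed to PDO::prepare() and execute().
```

Only the whitelisted fragments ever become part of the SQL string, so the shape of the query can vary while user input stays in bound parameters.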
2

You should just pass the string through mysql_real_escape_string (preferred over the older mysql_escape_string, which ignores the connection's character set).

I'd be wary of accepting any old regex though - some of them can run for a long time and will tie up your DB server.

From Pattern Syntax:

Beware of patterns that contain nested indefinite repeats. These can take a long time to run when applied to a string that does not match. Consider the pattern fragment (a+)*

This can match "aaaa" in 33 different ways, and this number increases very rapidly as the string gets longer.

2 Comments

But wouldn't using mysql_escape_string change the regex? E.g. the user enters /\d+/ to match a series of digits, and instead ends up matching a backslash followed by a series of "d"s. That's why I wanted to roll my own in the first place.
The situation cries out for prepared statements. Whatever the user enters goes to the DB 1:1, and bound parameters can't be broken by SQL injection.
-2

If the regular expression is only going to be displayed, most programs simply HTML-encode the value, store it in the DB, and then decode it on the way out. Again, this is for display purposes only; if you need to actually use the submitted regular expression, this won't work.

Also be aware that there is a technique where the person intent on injecting writes out their SQL, converts it to varbinary, and submits an exec command with the base 64 representation of the query. I have been hit with this in the past.

1 Comment

I don't see how this answers the question at all. Can you clarify?
