
I'm having trouble getting a list of all parameters in an SQL query using regex.

Example of the query:

SELECT ... WHERE col1 = :user AND col2 = 'HELLO' OR col3 = :language

To obtain the parameters, I use the following regex pattern:

Pattern.compile(":([\\w.$]+|\"[^\"]+\"|'[^']+')", Pattern.MULTILINE)

The pattern returns the list of parameters correctly:

:user
:language

The problem is with another type of query, where literals might contain the character ':':

WHERE col1 = :user AND some_date > '2022-09-26T10:22:55'

The list of parameters for this case is:

:user
:22
:55
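
The false matches listed above can be reproduced with a minimal sketch. Only the pattern comes from the question; the class name NaiveParamScan and the helper find are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; only the pattern itself is taken from the question.
final class NaiveParamScan {
    private static final Pattern PARAM =
            Pattern.compile(":([\\w.$]+|\"[^\"]+\"|'[^']+')", Pattern.MULTILINE);

    // Collects every match of the pattern, including the leading ':'.
    static List<String> find(String sql) {
        List<String> found = new ArrayList<>();
        Matcher m = PARAM.matcher(sql);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // The pattern has no notion of string literals, so the ':' characters
        // inside the timestamp are treated like parameter prefixes.
        System.out.println(find("WHERE col1 = :user AND some_date > '2022-09-26T10:22:55'"));
    }
}
```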

Is there a better approach that will not treat the contents of literals as parameters?

  • Pattern.compile("[^'\":]*:([\\w.$]+|\"[^\"]+\"|'[^']+')", Pattern.MULTILINE) Commented Sep 26, 2022 at 8:47
  • Thanks for the response. But in this scenario: code <> '*'\nAND code IN (select x from ... where user = :user) it treats \nAND code IN (select x from ... where user = :user as a parameter. Commented Sep 26, 2022 at 8:59
  • Pattern.DOTALL will let the . also match newlines. I forgot that. Commented Sep 26, 2022 at 10:46

1 Answer

You could simplify the problem by assuming that a named parameter in SQL is just a word prefixed with : that always follows a space. This is not a requirement and will not always hold, but it may be good enough to give acceptable results with the simplest possible regex:

Pattern.compile(" :\\w+", Pattern.MULTILINE)
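
As a rough sketch of how this simplified pattern behaves, the snippet below runs it against the query from the question. The class name SpacePrefixedParams and the helper find are hypothetical; trimming the leading space from each match is an added assumption, not part of the pattern above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class wrapping the simplified pattern.
final class SpacePrefixedParams {
    private static final Pattern PARAM = Pattern.compile(" :\\w+", Pattern.MULTILINE);

    static List<String> find(String sql) {
        List<String> found = new ArrayList<>();
        Matcher m = PARAM.matcher(sql);
        while (m.find()) {
            // Each match starts with the required space, so strip it.
            found.add(m.group().trim());
        }
        return found;
    }

    public static void main(String[] args) {
        // The ':' characters in the timestamp follow digits, not a space.
        System.out.println(find("WHERE col1 = :user AND some_date > '2022-09-26T10:22:55'"));
        // A parameter written without a preceding space is not found.
        System.out.println(find("WHERE foo=:param"));
    }
}
```

With these inputs the first call should yield only :user and the second an empty list, which illustrates both the benefit and the limitation of the space assumption.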

--

Summary of the comments:

The pattern had to match parameters in expressions such as:
- foo = :param AND :param = bar AND foo=:param AND :param=bar
- AND FUNC(:param) OR FUNC(0, :param) OR FUNC(:param, 0)

Finally, this regex, which combines a fixed-length lookbehind with a variable-length lookahead, was helpful:

Pattern.compile("(?<=[=(])\\s*:[\\w_.]+|:[\\w_.]+(?=\\s*[=)])", Pattern.MULTILINE)
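
A minimal sketch of this lookaround pattern applied to the cases from the comments follows. The class name LookaroundParams and the helper find are hypothetical, and the sketch assumes every backslash is doubled in the Java string literal, so that the regex receives \s and \w. Trimming each match is also an assumption, added because the first alternative can include whitespace before the colon.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class wrapping the lookaround pattern.
final class LookaroundParams {
    // Inside a Java string literal, "\\s" produces the regex token \s.
    private static final Pattern PARAM = Pattern.compile(
            "(?<=[=(])\\s*:[\\w_.]+|:[\\w_.]+(?=\\s*[=)])", Pattern.MULTILINE);

    static List<String> find(String sql) {
        List<String> found = new ArrayList<>();
        Matcher m = PARAM.matcher(sql);
        while (m.find()) {
            // The first alternative may capture whitespace between '=' or '(' and ':'.
            found.add(m.group().trim());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(find("foo = :param AND :param = bar AND foo=:param AND :param=bar"));
        System.out.println(find("AND FUNC(:param) OR FUNC(0, :param) OR FUNC(:param, 0)"));
        System.out.println(find("WHERE col1 = :user AND some_date > '2022-09-26T10:22:55'"));
    }
}
```

The pattern only recognizes a parameter that directly follows = or (, or is directly followed by = or ). A parameter in any other position, such as the middle one in IN (:a, :b, :c), would not be found.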

6 Comments

Thanks for the answer. As much as I wish I could, I can't. In my queries, parameters can also contain "_" or "." and might not be preceded by whitespace. Unfortunately, I can't change that, because the SQL query is generated externally.
What about Pattern.compile("(?<==) ?:[\\w_.]+", Pattern.MULTILINE), using a regex lookbehind and assuming the parameter follows a =?
Well, the parameter does not always follow the = operator. You can have the reversed scenario, :parameter = 'ABC'.
So it's either a lookbehind or a lookahead. (The lookbehind must be fixed length, so the variable-length whitespace can't go inside it.) (?<==)\s*:[\w_.]+|:[\w_.]+(?=\s*=) see rubular.com/r/BmGCgJdZ4bZFkR
This one works better, but I'm still facing issues with function calls: AND FUNC(:param) OR FUNC(0, :param) OR FUNC(:param, 0). Edit: The regex I've been using matched everything properly, but yours somehow properly ignores literals. rubular.com/r/uZ9BsgOqpU0KkH