1

I have a query:

select a.id, a.nama, b.alamat from tsql as a join tswl as b on a.id = b.id join tscl as c on a.id=c.id

Using a regex, I want to extract only the table names from the query string.

I tried this regular expression:

select.*from\s+(\w+).*join\s(\w+).*

This captures the first table name after "from", but it skips the second table name and captures the third one instead.
Current groups: [Group 1: "tsql", Group 2: "tscl"]

Target groups: [Group 1: "tsql", Group 2: "tswl", Group 3: "tscl"], and so on through the last table name in the chain of joins.

Any help will be appreciated!

1
  • 1
    You can't get 3 groups in the results since there are only 2 capturing groups in the regex. You need to match what you can and then split. Or, use optional groups, like select.*from\s+(\w+)(?:.*?join\s(\w+))?(?:.*?join\s(\w+))?, you may repeat them as many times as you wish. Commented Aug 1, 2018 at 12:30

3 Answers

2

Your current expression does not consider the possible repetition of the from/join keywords.

You could try this expression:

(?<=from|join)\s+(\w+)

You can try it yourself here: https://regex101.com/r/qQ1rvs/1

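The positive lookbehind (?<=...) requires each match to be immediately preceded by from or join, without making the keyword part of the match. The group (\w+) then captures the word that follows. When the expression is applied globally, every table name that comes after one of these keywords is captured.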
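As a rough sketch of how this expression might be used from JavaScript, the snippet below applies it with the g flag and collects capture group 1 from every match. The helper name extractTables and the added i flag are assumptions for illustration, not part of the answer above. It also assumes an engine that supports lookbehind and String.prototype.matchAll.

```javascript
// Hypothetical helper (name chosen for illustration only).
function extractTables(sql) {
  // (?<=from|join) asserts that the keyword precedes the match without consuming it.
  // g: find every occurrence; i: ignore keyword case (an added assumption).
  const pattern = /(?<=from|join)\s+(\w+)/gi;
  // matchAll yields one match object per occurrence; index 1 is the captured name.
  return Array.from(sql.matchAll(pattern), (m) => m[1]);
}

const sampleQuery =
  'select a.id, a.nama, b.alamat from tsql as a join tswl as b on a.id = b.id join tscl as c on a.id=c.id';
console.log(extractTables(sampleQuery)); // [ 'tsql', 'tswl', 'tscl' ]
```

Because the lookbehind has no word boundary, an identifier that merely ends in "from" or "join" and is followed by whitespace would also trigger a match.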


1

Note: The regexps discussed here will not work on most versions of MySQL because of the lame regexp parser.

Issues:

  • White space
  • from or join being used in a comment or string or column name, etc. (I won't tackle this.)
  • Possible backticks around table name -- handled, but with a bug
  • Punctuation inside backticks -- not handled
  • Possible setting that allows quotes around table name (not handled)
  • Subqueries/UNION -- The table names will be lumped together without regard for such.
  • Punctuation against from/join -- FROM a JOIN(b JOIN c) -- partially(?) handled
  • Derived table FROM ( SELECT ... ) -- not handled
  • Case (from vs FROM) -- outside the scope of this regexp
  • LEFT OUTER JOIN -- not a problem

So, this will mostly handle one table name:

\b(FROM|JOIN)\s+\`?(\w+)\`?

If JS has a way to iterate through a string, continuing where it left off, then you can get them all.
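JavaScript does have such a mechanism: a regex created with the g flag remembers its position in lastIndex, so repeated calls to exec continue where the previous match ended. The following is only a sketch of that idea. The function name tablesAfterKeywords is hypothetical, and the i flag is an added assumption to cover from vs FROM.

```javascript
// Hypothetical helper (name chosen for illustration only).
function tablesAfterKeywords(sql) {
  // Group 1 is the keyword, group 2 the table name; backticks are optional.
  const pattern = /\b(FROM|JOIN)\s+`?(\w+)`?/gi;
  const names = [];
  let match;
  // With the g flag, each exec call resumes from pattern.lastIndex
  // and returns null once no further match exists.
  while ((match = pattern.exec(sql)) !== null) {
    names.push(match[2]);
  }
  return names;
}

console.log(tablesAfterKeywords('SELECT * FROM `a` LEFT OUTER JOIN b ON a.id = b.id'));
// [ 'a', 'b' ]
```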

To better handle backticks and punctuation within:

\b(FROM|JOIN)\s+(\w+|`[^`]+`)

but you will need to strip the backticks when they occur.
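One possible way to do the stripping in JavaScript is sketched below. The function name tableNamesUnquoted is hypothetical, and the g and i flags are assumptions added so that all occurrences are found regardless of keyword case.

```javascript
// Hypothetical helper (name chosen for illustration only).
function tableNamesUnquoted(sql) {
  // Group 2 is either a bare word or a backtick-quoted name,
  // which may contain spaces or punctuation.
  const pattern = /\b(FROM|JOIN)\s+(\w+|`[^`]+`)/gi;
  // Remove one leading and one trailing backtick if present.
  return Array.from(sql.matchAll(pattern), (m) => m[2].replace(/^`|`$/g, ''));
}

console.log(tableNamesUnquoted('SELECT * FROM `order items` AS o JOIN users u ON o.uid = u.id'));
// [ 'order items', 'users' ]
```

The other limitations listed above (subqueries, derived tables, keywords inside strings or comments) still apply to this sketch.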


0

You could try this:

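var query = 'select a.id, a.nama, b.alamat from tsql as a join tswl as b on a.id = b.id join tscl as c on a.id=c.id';
// Keep the token after each from/join; splitting on /\s+/ also copes with tabs and newlines.
var matches = query.match(/\b(from|join)\s+(\w+)/gi) || []; // [] when nothing matches
console.log(matches.map(e => e.split(/\s+/)[1]));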

Output:

Array(3) [ "tsql", "tswl", "tscl" ]
