OK, this might seem too tough to be posted here, so I beg your pardon. I've been working on this for almost a week.
I need to extract all selected columns from a given Oracle SQL string. It should pass the following test cases:
// single column test
select col1 from dual
// ^ should match "col1"
// multiple column test
select col1,col2 from dual
// ^ should match "col1", "col2"
// multiple space test
select col1 , col2 from dual
// ^ should match "col1", "col2"
// "distinct" tests
select distinct col1 from dual
// ^ should match "col1"
select distinct col1, col2 from dual
// ^ should match "col1", "col2"
// "distinct" with whitespaces tests
select distinct col1 from dual
// ^ should match "col1"
select distinct col1 , col2 from dual
// ^ should match "col1", "col2"
// "as" tests
select col1 from dual
// ^ should match "col1"
select colA as col1 from dual
// ^ should match "col1"
select colA as col1, col2, col3 from dual
// ^ should match "col1", "col2", "col3"
select col1, colB as col2, col3 from dual
// ^ should match "col1", "col2", "col3"
select col1, col2, colC as col3 from dual
// ^ should match "col1", "col2", "col3"
// "as" with whitespaces tests
select colA as col1, colB as col2, colC as col3 from dual
// ^ should match "col1", "col2", "col3"
// "distinct" with "as" tests
select distinct colA as col1 from dual
// ^ should match "col1"
select distinct colA as col1, colB as col2, col3 from dual
// ^ should match "col1", "col2", "col3"
select distinct colA as col1, col2, colC as col3 from dual
// ^ should match "col1", "col2", "col3"
// function test
select funct('1','2') as col1 from dual
// ^ should match "col1"
select col1, funct('1','2') as col2 from dual
// ^ should match "col1", "col2"
select col1, colB as col2, funct('1','2') as col3 from dual
// ^ should match "col1", "col2", "col3"
I tried the following regexes in Java:
((?<=select\ )(?!distinct\ ).*?(?=,|from))
((?<=select\ distinct\ ).*?(?=,|from))
((?<=as\ ).*?(?=,|from))
((?<=,\ ).*?(?=,|from))(?!.*\ as\ ) // <- Right, I'm guessing here
I OR-ed them together, but I still can't pass all the test cases above. (I'm using this tool to validate my regex.)
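For reference, here is a minimal sketch of how the four patterns might be OR-ed together and run in Java. The exact way they were combined is an assumption (a plain `|` between them), and the class and method names (`RegexAttempt`, `matches`) are made up for illustration. Matches are trimmed because the lazy `.*?` keeps the space in front of `from`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexAttempt {
    // The four patterns from the question joined with '|'.
    // How they were actually combined is an assumption.
    private static final Pattern COMBINED = Pattern.compile(
            "((?<=select\\ )(?!distinct\\ ).*?(?=,|from))"
            + "|((?<=select\\ distinct\\ ).*?(?=,|from))"
            + "|((?<=as\\ ).*?(?=,|from))"
            + "|((?<=,\\ ).*?(?=,|from))(?!.*\\ as\\ )");

    // Collects every non-empty match, trimmed of surrounding whitespace.
    static List<String> matches(String sql) {
        List<String> found = new ArrayList<>();
        Matcher m = COMBINED.matcher(sql);
        while (m.find()) {
            String hit = m.group().trim();
            if (!hit.isEmpty()) {
                found.add(hit);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(matches("select col1 from dual"));
    }
}
```

One visible weakness: the fourth pattern's lookbehind requires a comma followed by a space, so in `select col1,col2 from dual` nothing picks up `col2`.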
I searched for an SQL evaluator but couldn't find one that extracts all columns without executing the statement against a real database, and that simply assumes all referenced tables and functions exist.
A Java regex, a free SQL evaluator (one that doesn't need a real database) that can pass the tests, or anything better than those two would be an acceptable answer. The assumption is that the SQL is always in Oracle 11g format.
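One possible direction for the "anything better" option is to avoid a single regex and instead cut out the select list, split it on top-level commas, and take the trailing name of each item. The sketch below is only that, a sketch under assumptions: a single flat `select ... from` statement, no subqueries or string literals containing the word `from`, no quoted identifiers, and every expression or function call carrying an alias. `SelectColumns`, `extract`, and `splitTopLevel` are hypothetical names, not part of any library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SelectColumns {
    // Captures the text between SELECT [DISTINCT] and the first FROM keyword.
    // Assumption: no subquery or literal containing " from" in the select list.
    private static final Pattern SELECT_LIST = Pattern.compile(
            "^\\s*select\\s+(?:distinct\\s+)?(.*?)\\s+from\\b",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Matches an item whose last whitespace-separated token is a plain identifier,
    // i.e. either a bare column name or the alias after an expression.
    private static final Pattern TRAILING_NAME = Pattern.compile(
            "^(?:.*\\s)?([A-Za-z][A-Za-z0-9_$#]*)$", Pattern.DOTALL);

    static List<String> extract(String sql) {
        List<String> names = new ArrayList<>();
        Matcher m = SELECT_LIST.matcher(sql);
        if (!m.find()) {
            return names; // not a statement this sketch understands
        }
        for (String item : splitTopLevel(m.group(1))) {
            Matcher name = TRAILING_NAME.matcher(item);
            // Fall back to the raw item text when no trailing identifier is found.
            names.add(name.matches() ? name.group(1) : item);
        }
        return names;
    }

    // Splits on commas that are outside parentheses and outside single quotes.
    static List<String> splitTopLevel(String list) {
        List<String> items = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;
        boolean inQuote = false;
        for (char c : list.toCharArray()) {
            if (c == '\'') {
                inQuote = !inQuote;
            } else if (!inQuote && c == '(') {
                depth++;
            } else if (!inQuote && c == ')') {
                depth--;
            }
            if (c == ',' && depth == 0 && !inQuote) {
                items.add(current.toString().trim());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        items.add(current.toString().trim());
        return items;
    }

    public static void main(String[] args) {
        // prints [col1, col2, col3]
        System.out.println(extract("select col1, colB as col2, funct('1','2') as col3 from dual"));
    }
}
```

This does not amount to an Oracle parser. Unaliased arithmetic such as `a + b` would wrongly yield `b`, and `table.column` references or quoted names would need extra handling, so a real SQL parser is likely the safer choice for anything beyond the listed test cases.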
tablename.columnname and using quotation marks to make column names with spaces, or do you know that your select statements have a more constrained syntax?