Python-Parsing a SQL using pyparsing

Question

I want to parse a complex SQL statement that contains joins (inner join, outer join) and get the names of the tables it uses.

I can get the table names from a simple select, but if the SQL has an inner join or left join, as below, the result contains only the first table.

select * from xyz  inner join dhf  on df = hfj  where z > 100

I am using a program similar to the one by Paul at the link below.

http://pyparsing.wikispaces.com/file/view/select_parser.py/158651233/select_parser.py

Can someone tell me how to get all the tables used in a SQL statement like the one below?

select * from xyz  inner join dhf  on df = hfj  where z > 100

This may be a duplicate of stackoverflow.com/q/35295458/409172. That solution requires a live database and a PL/SQL stored procedure to do most of the work; I'm not sure if that's feasible for you. But that's probably the only way to correctly parse complex SQL. Even non-trivial Oracle SQL is almost impossible to parse. With 2175 keywords, most of them not reserved, parsing Oracle SQL is a huge task. That's why you need a shortcut, like using the EXPLAIN PLAN method in that answer.
– Jon Heller, Commented Aug 30, 2016 at 6:22
Pyparsing is no longer hosted on wikispaces.com. Go to github.com/pyparsing/pyparsing
– PaulMcG, Commented Aug 27, 2018 at 12:43

PaulMcG · Accepted Answer · 2016-08-29 23:44:21Z

1

This parser was written a long time ago, and handling multiple values in a results name did not come along until later.

Change this line in the parser you cited:

single_source = ( (Group(database_name("database") + "." + table_name("table")) | table_name("table")) +

to

single_source = ( (Group(database_name("database") + "." + table_name("table*")) | table_name("table*")) +

When I run your sample statement through the select_stmt parser, I now get this:

select * from xyz  inner join dhf  on df = hfj  where z > 100
['SELECT', ['*'], 'FROM', 'xyz', 'INNER', 'JOIN', 'dhf', 'ON', ['df', '=', 'hfj'], 'WHERE', ['z', '>', '100']]
- columns: ['*']
- table: [['xyz'], ['dhf']]
  [0]:
    ['xyz']
  [1]:
    ['dhf']
- where_expr: ['z', '>', '100']
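
To see the effect of the trailing "*" in isolation, here is a minimal sketch. It does not use the real select_parser.py grammar: the tiny FROM/JOIN grammar and the build_parser and table_names helpers are assumptions made up for illustration. It uses the older camelCase parseString call, matching the pyparsing API of this answer's era.

```python
from pyparsing import CaselessKeyword, Optional, Word, ZeroOrMore, alphanums, alphas

# Toy stand-in for the real grammar (assumption): FROM t [INNER] JOIN t ...
FROM, INNER, JOIN = map(CaselessKeyword, ["from", "inner", "join"])
identifier = Word(alphas, alphanums + "_")


def build_parser(results_name):
    """Build the toy grammar, tagging every table with the given results name."""
    table = identifier(results_name)
    return FROM + table + ZeroOrMore(Optional(INNER) + JOIN + table)


def table_names(sql):
    """Return every table matched under the accumulating name "table*"."""
    result = build_parser("table*").parseString(sql)
    return list(result["table"])


sql = "from xyz inner join dhf"

# Without "*", the name holds a single value, so one table is lost.
single = build_parser("table").parseString(sql)["table"]
print(type(single).__name__)

# With "*", every match is kept.
print(table_names(sql))
```

With the plain name, the lookup yields one string even though two tables were parsed; with "table*", both are available under the same name.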


4 Comments

user6771430 Over a year ago

Thanks, Paul, for the reply. It works exactly as expected. I am getting all the table names from the SQL.

user6771430 Over a year ago

Is it possible to get all the join columns also from the SQL query?

dfrankow Over a year ago

Why "table*"? I can sort of tell it allows more than one, but I can't find any docs on it.

PaulMcG Over a year ago

The behavior is described in the docs for setResultsName (pythonhosted.org/pyparsing/…) under the listAllMatches argument, and then the '*' is explained in the __call__ docstring (pythonhosted.org/pyparsing/…)
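
As a small illustration of the comment above, the sketch below checks that the "*" shorthand and an explicit listAllMatches=True give the same result. The grammar (a run of plain words) and the name "w" are made up for the example; the camelCase setResultsName and parseString spellings are the older ones referred to in this thread.

```python
from pyparsing import OneOrMore, Word, alphas

word = Word(alphas)

# Shorthand: calling the expression with a name ending in "*"
starred = OneOrMore(word("w*"))

# Long form: the same thing spelled out with listAllMatches
explicit = OneOrMore(word.setResultsName("w", listAllMatches=True))

text = "a b c"
starred_words = list(starred.parseString(text)["w"])
explicit_words = list(explicit.parseString(text)["w"])

print(starred_words == explicit_words)
```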

M T Head · Accepted Answer · 2016-08-29 19:44:39Z

-1

The answer depends on which SQL platform you are using.

I will answer assuming you are using MS SQL Server. The same logic should work on all SQL platforms, though the syntax changes.

Tables are uniquely identified by the combination of owner and table name. I do a select that returns #Owner#TableName# in a Python script that I wrote to extract all data in all tables to text files. Assuming you do not have multiple tables with the same name under different owners, the basic form is:

Select name from SysObjects where xtype = 'U' order by name

This gives you a list of all tables. Then you loop over that list, running "Select * from [table name from other query]" for each entry, until you have covered every table returned from SysObjects.

The same approach is practical on all SQL platforms, assuming you have access to the system tables.
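
A rough sketch of that loop follows. It is not the script described above: to stay self-contained it swaps in SQLite from the Python standard library, with sqlite_master standing in for SysObjects, and the dump_all_tables helper and the sample tables are hypothetical.

```python
import sqlite3


def dump_all_tables(conn):
    """Return {table_name: rows} for every user table in the database."""
    cur = conn.cursor()
    # Assumption: SQLite stand-in. sqlite_master plays the role of SysObjects.
    cur.execute("SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    names = [row[0] for row in cur.fetchall()]

    data = {}
    for name in names:
        # Table names cannot be bound as parameters, so quote the identifier.
        quoted = '"' + name.replace('"', '""') + '"'
        data[name] = cur.execute("SELECT * FROM " + quoted).fetchall()
    return data


# Hypothetical sample data, named after the tables in the question
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE xyz (z INTEGER)")
conn.execute("CREATE TABLE dhf (hfj INTEGER)")
conn.execute("INSERT INTO xyz VALUES (150)")

tables = dump_all_tables(conn)
print(sorted(tables))
```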


2 Comments

M T Head Over a year ago

Running select * from syscolumns can give you the column names.

PaulMcG Over a year ago

You misread the question. The OP does not want to query the db for the table names, they want to extract the table names from the posted SQL statement.
