1

I want to parse a Teradata ANSI sql and get the join columns and tables used in the query.

For example below is the query

select * from
TABLEA A INNER JOIN TABLEB b ON A.COL1 =B.COL2 INNER JOIN TABLEc C ON A.COL1 =C.COL3

I want to get the result like below

Tables :tableA ,tableb, tablec columns = a.col1,b.col2,c.col3

I already have the code to get all the tablenames from a SQL. Iam using the below script from Paul to parse the tablename I want to modify the below script to get the join column names from the SQL

join_constraint = Optional(ON + expr | USING + LPAR + Group(delimitedList(column_name)) + RPAR)

join_op = COMMA | (Optional(NATURAL) + Optional(INNER | CROSS | LEFT + OUTER | LEFT | OUTER) + JOIN)

join_source = Forward()
single_source = ( (Group(database_name("database") + "." + table_name("table*")) | table_name("table*")) +
                    Optional(Optional(AS) + table_alias("table_alias")) +
                    Optional(INDEXED + BY + index_name("name") | NOT + INDEXED)("index") |
                  (LPAR + select_stmt + RPAR + Optional(Optional(AS) + table_alias)) |
                  (LPAR + join_source + RPAR) )

join_source << single_source + ZeroOrMore(join_op + single_source + join_constraint)

result_column = "*" | table_name + "." + "*" | (expr + Optional(Optional(AS) + column_alias))
select_core = (SELECT + Optional(DISTINCT | ALL) + Group(delimitedList(result_column))("columns") +
                Optional(FROM + join_source) +
                Optional(WHERE + expr("where_expr")) +
                Optional(GROUP + BY + Group(delimitedList(ordering_term)("group_by_terms")) +
                        Optional(HAVING + expr("having_expr"))))

select_stmt << (select_core + ZeroOrMore(compound_operator + select_core) +
                Optional(ORDER + BY + Group(delimitedList(ordering_term))("order_by_terms")) +
                Optional(LIMIT + (integer + OFFSET + integer | integer + COMMA + integer)))

tests = """\
    select * from xyzzy inner join dhfjfjfj on df = hfj  inner join abc on dg = hj inner join cde on cv = nj  where z > 100
    select * from xyzzy,hfhfhf  where z > 100 order by zz
    select * from xyzzy""".splitlines()
for t in tests:
    print t
    try:
        print select_stmt.parseString(t).dump()
    except ParseException, pe:
        print pe.msg
    print

1
  • Locate the elements in the grammar that are the parts you want to retrieve, and give them a name like "join_columns*". The trailing '*' will make sure that you get all the values, not just the last one. You should be able to take it from here. Commented Aug 31, 2016 at 23:28

0

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.