Split multiple SQL statements into individual SQL statements

Question

Intro note: I'm hoping a library or routine exists to do this, but I haven't been able to find anything like this. I'm really looking for direction and advice on where to start...

Here is the situation: I have a block of SQL commands coming in as plain text. It might be one SQL command or several. I need a way to split multiple SQL commands so I can run them one at a time. Microsoft SQL Management Studio has this behavior out of the box.

I'm trying to add this functionality to a PHP5/MySQL5 application running on Apache (Debian).

Some important points:

1. I really do need to run them one at a time. Seriously.
2. I don't want to require the user to enter a semi-colon after each SQL statement.
3. SQL statements can be on one or multiple lines, so I can't wrap on LBs/CRs
4. It needs to support SELECT, UPDATE, INSERT, DELETE at least.
5. It needs to support queries that are sub-selects
6. Neatly tabbed SQL needs to work
7. (In the interest of usable software) I do not want to force the user to enter in any kind of delimiter.

Here is an example block of SQL I need to split into two statements:

select sMessage, 
(
    SELECT COUNT(sTag) FROM Tags WHERE ixTicket = note.ixTicket
) FROM note
select * from ticket
    WHERE (SELECT MAX(nCount) FROM Counter WHERE ixTicket = ticket.ixTicket) > 5

I tried some RegEx attempts, but that doesn't seem to be powerful enough.

Any recommendation on an approach to tackle this problem?

Point 7 is really making this a bear ... even Oracle and SQL Server require some type of delimiter between statements (;, GO, \, etc). This is EXCEEDINGLY difficult w/o a delimiter (e.g. just think about all the places a SELECT can go, plus you have UNIONs and similar statements to contend with).
– Beep beep, Commented Mar 11, 2009 at 3:34
BTW - SQL Server Management Studio does require a "GO" or ";" between multiple statements ... just not for 1. You're asking for an easy way to do something so difficult that even MS and Oracle don't provide it.
– Beep beep, Commented Mar 11, 2009 at 5:17
@LuckLindy : "SQL Server Management studio does require a "GO" or ";" between multiple statements ... just not for 1." That is actually wrong. Just FYI. Open SQL Studio and check it out.
– Justin, Commented Apr 24, 2009 at 2:09

bobince · Accepted Answer · 2009-03-11 03:46:01Z

3

I'm not sure this is possible at all. You would certainly need an in-depth knowledge of the SQL syntax of your target DBMS. For example, just off the top of my head, this is a single MySQL statement:

INSERT INTO things
SELECT * FROM otherthings ON DUPLICATE KEY
UPDATE thingness=thingness+1

It is likely there are constructs in some DBMSs that, without a delimiter, could be ambiguous.

I don't want to require the user to enter a semi-colon after each SQL statement.

I think you may be forced to. It's totally the standard way to delimit SQL statements. Even if you can find a heuristic to detect probably-start-of-SQL-statement points, you risk disasters like an accidental “DELETE FROM things”-without-WHERE-clause.

SQL statements can be on one or multiple lines, so I can't wrap on LBs/CRs

Would double-newline-for-new-statement be acceptable?
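As a rough illustration of the blank-line idea, here is a minimal TypeScript sketch. The name splitOnBlankLines is hypothetical, and the sketch assumes that no statement contains a blank line of its own (for example inside a multi-line string literal).

```typescript
// Hypothetical helper: treats one or more blank lines as the statement separator.
// Assumption: statements never contain blank lines themselves.
function splitOnBlankLines(sql: string): string[] {
    return sql
        // A separator is a line break followed by at least one empty
        // (or whitespace-only) line.
        .split(/\r?\n[ \t]*(?:\r?\n[ \t]*)+/)
        .map(part => part.trim())
        .filter(part => part.length > 0);
}

const blankLineParts = splitOnBlankLines(
    "select 1\n  from t\n\nselect 2\n"
);
console.log(blankLineParts.length);
```

A single line break inside a statement is left alone, so neatly tabbed multi-line SQL survives; only an empty line ends a statement.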

I tried some RegEx attempts, but that doesn't seem to be powerful enough.

No, even with semicolon delimiters, regex is nowhere near powerful enough to parse SQL. Problem points would include:

';'
";"
`;`
'\';'
''';'
-- ;
#;
/*;*/

and any interposition of these structures. Eek!
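To make the list above concrete, here is a hedged TypeScript sketch of a small hand-written scanner that splits on semicolons only when they sit outside quotes and comments. It is not a SQL parser. The function names are hypothetical, and it assumes MySQL-style lexical rules: backslash escapes and doubled quotes inside '...' and "...", doubled backticks inside identifiers, and `--`, `#` and `/* */` comments. It also treats any `--` as a comment start, which is a simplification.

```typescript
// A sketch only: splits on semicolons that are outside quotes and comments.
// splitOnSemicolons, skipQuoted and pushIfNotEmpty are hypothetical names.
function splitOnSemicolons(sql: string): string[] {
    const statements: string[] = [];
    let start = 0; // index where the current statement begins
    let i = 0;
    while (i < sql.length) {
        const ch = sql[i];
        const next = sql[i + 1];
        if (ch === "'" || ch === '"' || ch === "`") {
            i = skipQuoted(sql, i);
        } else if (ch === "#" || (ch === "-" && next === "-")) {
            // Line comment: skip to the end of the line.
            const eol = sql.indexOf("\n", i);
            i = eol === -1 ? sql.length : eol + 1;
        } else if (ch === "/" && next === "*") {
            // Block comment: skip to just past the closing "*/".
            const close = sql.indexOf("*/", i + 2);
            i = close === -1 ? sql.length : close + 2;
        } else if (ch === ";") {
            pushIfNotEmpty(statements, sql.slice(start, i));
            start = i + 1;
            i++;
        } else {
            i++;
        }
    }
    // Whatever follows the last semicolon is the final statement.
    pushIfNotEmpty(statements, sql.slice(start));
    return statements;
}

// Returns the index just past the closing quote of the literal opening at `open`.
function skipQuoted(sql: string, open: number): number {
    const quote = sql[open];
    let i = open + 1;
    while (i < sql.length) {
        if (sql[i] === "\\" && quote !== "`") {
            i += 2; // backslash escape: skip the escaped character
        } else if (sql[i] === quote) {
            if (sql[i + 1] === quote) {
                i += 2; // a doubled quote is an escaped quote
            } else {
                return i + 1;
            }
        } else {
            i++;
        }
    }
    return sql.length; // unterminated literal: consume the rest
}

function pushIfNotEmpty(out: string[], text: string): void {
    const trimmed = text.trim();
    if (trimmed.length > 0) {
        out.push(trimmed);
    }
}
```

Comments are kept as part of the statement text rather than stripped, so a fragment consisting only of a comment would still be returned as a "statement". A real implementation would need to decide how to treat that case.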

edited Mar 11, 2009 at 3:46

answered Mar 11, 2009 at 3:38


1 Comment

Justin Over a year ago

All good points, but I don't want to require a delimiter. It's very possible and safe to parse it without, just look @ SQL Management Studio. I didn't say it was going to be easy.

Craig · Accepted Answer · 2009-03-11 01:50:29Z

1

Maybe try this library. I have used it successfully for parsing SQL in the past. http://www.sqlparser.com/

answered Mar 11, 2009 at 1:50


1 Comment

Justin Over a year ago

I'll check this out. I need to do this in code, so I'm not sure that would work, but I'll look tomorrow.

Jonathan Leffler · Accepted Answer · 2009-03-11 04:36:16Z

1

To add a quirk to the discussion that periodically causes issues:

DECLARE c CURSOR FOR
    SELECT * FROM SomeWhere ...
        FOR UPDATE

The trailing UPDATE tends to throw ad hoc parsers off their stride. It may well be that you don't have to worry about that because the DECLARE notation (which is really Embedded SQL, not plain SQL) is not permitted in the first place. But the FOR UPDATE clause can appear in some dialects of SQL even when not in a DECLARE statement, so beware.

answered Mar 11, 2009 at 4:36


mhoms · Accepted Answer · 2010-05-12 12:36:03Z

1

Maybe with the following Java regexp? Check the test:

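// Assumes JUnit 4 on the classpath; the class name is only a placeholder.
import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class SqlSplitRegexpTest {

    @Test
    public void testRegexp() {
        // Two statements, each with a semicolon inside a single-quoted literal.
        String s = //
            "SELECT 'hello;world' \n" + //
            "FROM DUAL; \n" + //
            "\n" + //
            "SELECT 'hello;world' \n" + //
            "FROM DUAL; \n" + //
            "\n";

        // Lazily consume non-semicolon text and whole '...' literals, then the
        // terminating ';'. Only single-quoted literals are recognised here;
        // comments and other quote styles are not.
        String regexp = "([^;]*?('.*?')?)*?;\\s*";

        assertEquals("<statement><statement>", s.replaceAll(regexp, "<statement>"));
    }
}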

answered May 12, 2010 at 12:36


Jeroen Vermeulen · Accepted Answer · 2011-12-10 16:52:15Z

1

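    // Example input: two statements separated by a semicolon.
    $sMultiQuery = 'SHOW TABLES; SELECT * FROM `test`';
    $aQueries    = array();

    // Match runs of text, treating '...' and "..." literals as units, up to a
    // ';' or the end of the input. Backtick-quoted identifiers and SQL comments
    // are not handled by this pattern.
    if ( preg_match_all('/([^;]*?((\'.*?\')|(".*?"))?)*?(;\s*|\s*$)/', $sMultiQuery, $aMatches) )
    {
        $aQueries = $aMatches[0];
    }
    else
    {
        $aQueries = array($sMultiQuery);
    }

    foreach ( $aQueries as $sQuery )
    {
        // The pattern can also produce an empty match at the end of the input; skip it.
        if ( trim($sQuery) === '' )
        {
            continue;
        }

        # Do your thing
    }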

answered Dec 10, 2011 at 16:52


Scott Vance · Accepted Answer · 2024-02-05 04:56:18Z

If anybody needs a JavaScript/TypeScript function to split SQL statements, here you go:

const splitSqlStatements=(statements:string):string[]=>{
    const split:string[]=[];
    let inside:'"'|"'"|'-'|null=null;
    let s=0;
    for(let i=0;i<statements.length;i++){
        const char=statements[i] as string;

        if(inside){
            if(inside==='-'){
                if(char==='\n'){
                    inside=null;
                    s=i+1;
                }
            }else if(char===inside){
                if(statements[i+1]===inside){
                    i++;
                }else{
                    inside=null;
                }
            }
        }else if(char==='"'){
            inside='"';
        }else if(char==="'"){
            inside="'";
        }else if(char==='-' && statements[i+1]==='-'){
            const v=statements.substring(s,i).trim();
            if(v){
                split.push(v);
            }
            inside='-';
            i++
        }else if(char===';'){
            const v=statements.substring(s,i).trim();
            if(v){
                split.push(v);
            }
            s=i+1;
        }
    }
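    // Flush whatever follows the last semicolon, unless the input ends inside a "--" comment.
    // (The earlier check against statements.length-1 dropped a trailing one-character statement.)
    if(inside!=='-'){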
        const v=statements.substring(s).trim();
        if(v){
            split.push(v);
        }
    }
    return split;
}

Beep beep · Accepted Answer · 2009-03-11 01:47:26Z

0

Your best bet is to require the user to put some type of delimiter between statements. For example: require each statement to be delimited by a line containing only the word GO, or a "\", or end each statement with a ";".

This way you can easily break the single string into separate SQL statements.

answered Mar 11, 2009 at 1:47


1 Comment

Justin Over a year ago

Added point 7. Forgot to mention I definitely don't want delimiter.

Nick Josevski · Accepted Answer · 2009-03-11 01:52:33Z

0

If you don't want your users to put in a delimiting character such as ';' or anything else, you will need to parse the input yourself and have logic to determine where statements begin.

Your logic will need to deal with the obvious query starting keywords 'SELECT', 'UPDATE', 'INSERT', 'DELETE' and work forward to the next keyword (or end of input).

answered Mar 11, 2009 at 1:52


2 Comments

Craig Over a year ago

I recently worked on a SQL parser. Initially I thought, like you did, that it should be relatively straightforward, but don't be fooled. Even with the help of a third-party SQL parser component I still had to write 600 lines of code to do some pretty simple parsing.

Justin Over a year ago

Yah, I know I can write my own routine to do this. But every time I try, the result is hideous and it fails.

Craig T · Accepted Answer · 2009-03-11 01:58:45Z

0

Have you tried using the keywords 'SELECT', 'UPDATE', 'INSERT' and 'DELETE' combined with counting the number of opening '(' and closing ')' parentheses?

This should allow you to skip over nested SELECT statements and find the correct end of each statement.

answered Mar 11, 2009 at 1:58


1 Comment

Justin Over a year ago

Yes, I did. The code got hideous and long, and I constantly found use cases that broke it, so I figured there must be a more flexible way.

kquinn · Accepted Answer · 2009-03-11 02:08:32Z

0

You need to require the semicolon delimiter. Technically, without it a SQL statement is completely invalid; anyone omitting it is writing malformed SQL. Requiring the semicolon solves all of your problems, in a standardized way, and makes the software easy to write.

Perhaps do the following: if the user enters a query not containing one or more semicolons (outside of quotes, of course), add a semicolon at the end and run it as a single query. Otherwise, split the entered queries at semicolons and run each one individually, perhaps tacking on a semicolon at the end of the final query if omitted.

This solution is easy to write, SQL standard compliant, and just plain works. Not requiring the delimiter is a sure path to madness.

answered Mar 11, 2009 at 2:08


7 Comments

Justin Over a year ago

"Not requiring the delimiter is a sure path to madness." I totally disagree. That's one of the best features in MS SQL Management Studio.

kquinn Over a year ago

If LuckyLindy's comment above is to be believed, even SQL Management Studio uses the approach I describe. Not requiring delimiters will require you to write a full SQL parser, as complex as the server's itself. Don't do it. 'Saving' users the 'trouble' of semicolons will only hurt in the long run.

Justin Over a year ago

@kquinn - SQL Server Management Studio does not require a delimiter.

kquinn Over a year ago

You're still crazy for demanding this. The alternative is simple and easy. Your demand for no delimiters is difficult, error-prone, and fragile.

Justin Over a year ago

@kquinn. Respectfully, I disagree. SQL Server Management Studio does not require delimiters and it is not an error-prone or fragile application. The idea is to design software for how people will tend to behave, not how I want them to behave. Look, I'm not saying this is easy, and I'm certainly not going to try to write my own SQL parser in PHP, but it's still worth trying to find a way.


MikeW · Accepted Answer · 2009-03-11 02:21:58Z

-1

You could parse it yourself I suppose. Look for the keywords SELECT, DELETE, UPDATE, INSERT, EXEC, etc.

As you parse, if you encounter a "(" increment a counter: nest_level++

If you encounter a ")" decrement nest_level--

Then when you come across a keyword, and nest_level == 0, then you've come to the next statement.

You'll also have to handle cases like

 INSERT ...
 SELECT ....

So for an INSERT you would have to look for either SELECT or VALUES...

And no doubt other cases.

I agree with kquinn: you should just require the semicolon. I don't think there's anything "uncool" about that.
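For illustration, the keyword-plus-nesting-level idea described in this answer could be sketched in TypeScript roughly as follows. This is a heuristic under stated assumptions, not a parser: splitByKeywords is a hypothetical name, only single-quoted strings are skipped, comments and double-quoted or backtick-quoted names are not recognised, and only a few special cases from this thread are covered (INSERT ... SELECT, UNION [ALL] SELECT, ON DUPLICATE KEY UPDATE, FOR UPDATE).

```typescript
// A heuristic sketch, not a SQL parser. splitByKeywords is a hypothetical name.
const STATEMENT_STARTERS = ["SELECT", "INSERT", "UPDATE", "DELETE"];

function splitByKeywords(sql: string): string[] {
    const statements: string[] = [];
    // Tokens: single-quoted strings, words, parentheses. Everything else is skipped.
    const tokenPattern = /'(?:[^']|'')*'|[A-Za-z_][A-Za-z0-9_]*|[()]/g;
    let depth = 0;               // current parenthesis nesting level
    let start = 0;               // offset where the current statement begins
    let firstWord = "";          // first word of the current statement
    let prevWord = "";           // previous word token, upper-cased
    let insertHasSource = false; // the current INSERT already met SELECT or VALUES

    let m: RegExpExecArray | null;
    while ((m = tokenPattern.exec(sql)) !== null) {
        const token = m[0];
        if (token === "(") { depth++; continue; }
        if (token === ")") { depth--; continue; }
        if (token[0] === "'") { continue; } // string literal: ignore its contents

        const word = token.toUpperCase();
        const isStarter = STATEMENT_STARTERS.indexOf(word) !== -1;
        if (depth === 0 && isStarter && firstWord !== "") {
            // Cases where a starter keyword continues the current statement.
            const insertSource =
                word === "SELECT" && firstWord === "INSERT" && !insertHasSource;
            const afterUnion =
                word === "SELECT" && (prevWord === "UNION" || prevWord === "ALL");
            const trailingUpdate =
                word === "UPDATE" && (prevWord === "KEY" || prevWord === "FOR");
            if (!insertSource && !afterUnion && !trailingUpdate) {
                statements.push(sql.slice(start, m.index).trim());
                start = m.index;
                firstWord = "";
                insertHasSource = false;
            }
        }
        if (firstWord === "") {
            firstWord = word;
        }
        if (depth === 0 && firstWord === "INSERT" &&
            (word === "SELECT" || word === "VALUES")) {
            insertHasSource = true;
        }
        prevWord = word;
    }
    const rest = sql.slice(start).trim();
    if (rest.length > 0) {
        statements.push(rest);
    }
    return statements;
}
```

Run against the two-statement example from the question, this sketch splits at the second top-level `select`, because the sub-selects sit at a nesting level above zero. It is easy to defeat, though (a column named `update`, or a dialect-specific construct not listed above), which is the "no doubt other cases" problem in practice.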

answered Mar 11, 2009 at 2:21


1 Comment

Justin Over a year ago

Yah, these are all the traps that I caught in trying to write my own algorithm.
