3

Hope someone can help me out with this one!

I have an SQL file that looks like this:

CREATE TABLE IF NOT EXISTS users(
    id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    firstname VARCHAR(30) NOT NULL,
    lastname VARCHAR(30) NOT NULL,

    PRIMARY KEY (id),
    CONSTRAINT UNIQUE (firstname,lastname)
)
ENGINE=InnoDB
;

INSERT IGNORE INTO users (firstname,lastname) VALUES ('x','y');
/*
INSERT IGNORE INTO users (firstname,lastname) VALUES ('a','b');
*/

I have built a web application that initializes a MySQL database at startup with this function:

public static void initDatabase(ConnectionPool pool, File sqlFile){
    Connection con = null;
    Statement st = null;
    String mySb=null;
    try{
        con = pool.getConnection();
        mySb=IOUtils.copyToString(sqlFile);

        // We use ";" as a delimiter for each request then we are sure to have well formed statements
        String[] inst = mySb.split(";");

        st = con.createStatement();

        for(int i = 0; i<inst.length; i++){
            // we ensure that there is no spaces before or after the request string
            // in order not to execute empty statements
            if(!inst[i].trim().isEmpty()){
                st.executeUpdate(inst[i]);
            }
        }
        st.close();
    }catch(IOException e){
        throw new RuntimeException(e);
    }catch(SQLException e){
        throw new RuntimeException(e);
    }finally{
        SQLUtils.safeClose(st);
        pool.close(con);
    }
}

(This function was found on the web. Author, please forgive me for not citing your name, I lost it !!)

It works perfectly as long as there are no SQL comment blocks.

The copyToString() function basically does what it says. What I would like now is to build a regex that removes block comments from the string. I only have block comments /* */ in the file, no -- line comments.

What I have tried so far:

mySb = mySb.replaceAll("/\\*.*\\*/", "");

Unfortunately, I'm not very good at regex...

The trouble is that the matched string ends up looking like /* comment */ real statement /* another comment */, so the real statement between the comments is removed too, and so on...

1
  • You need the lazy operator ? in your regex. (Commented Apr 19, 2012 at 10:37)

4 Answers

10

Try

mySb = mySb.replaceAll("/\\*.*?\\*/", "");

(Notice the ?, which makes the quantifier "lazy".)

EDIT: To cover multiline comments, use this approach:

Pattern commentPattern = Pattern.compile("/\\*.*?\\*/", Pattern.DOTALL);
mySb = commentPattern.matcher(mySb).replaceAll("");

Hope this works for you.
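As a small illustration of the DOTALL point, the same flag can also be switched on inside the pattern itself with the embedded (?s) flag, so that String.replaceAll handles multi-line comments directly. This is only a sketch: the class name, the method name, and the sample SQL are made up for the example.

```java
class StripBlockComments {

    // (?s) is the embedded form of Pattern.DOTALL: '.' then also matches line terminators.
    static String stripBlockComments(String sql) {
        return sql.replaceAll("(?s)/\\*.*?\\*/", "");
    }

    public static void main(String[] args) {
        // Hypothetical input: one block comment spanning several lines.
        String sql = "INSERT INTO t VALUES (1);\n"
                   + "/*\n"
                   + "INSERT INTO t VALUES (2);\n"
                   + "*/\n"
                   + "INSERT INTO t VALUES (3);";
        System.out.println(stripBlockComments(sql));
    }
}
```

Without (?s) or Pattern.DOTALL, the comment in this sample would be left untouched, because . stops at the first line break.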


3 Comments

It seems to work for a block comment that is on one line, but not for blocks that span multiple lines! Does .*? also match line breaks?
Ouch, you need to use a full-fledged Pattern instance with the Pattern.DOTALL flag. I will modify the answer shortly.
After reading the docs for the Pattern class, it works, thanks a lot!
2

You need to use a reluctant quantifier, like this:

public class Main {

    public static void main(String[] args) {
        String s = "The matched string look something like /* comment */ real statement /* another comment*/";
        System.err.println(s.replaceAll("/\\*.*?\\*/", ""));
    }
}

Comments

2

Try the following approach:

String s = "/* comment */ select * from XYZ; /* comment */";
System.out.println(s.replaceAll("/\\*.*?\\*/", ""));

Outputs:

 select * from XYZ; 

The ? in .*? makes the quantifier lazy instead of greedy. By default, .* is greedy and matches the longest string possible, so you have to make it non-greedy by appending ?.
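To make the difference concrete, here is a minimal side-by-side sketch run on the same input as above. The class and method names are invented for the illustration.

```java
class GreedyVsLazy {

    // Greedy: .* runs to the LAST "*/" on the line, swallowing the statement in between.
    static String stripGreedy(String s) {
        return s.replaceAll("/\\*.*\\*/", "");
    }

    // Lazy: .*? stops at the FIRST "*/" found after each "/*".
    static String stripLazy(String s) {
        return s.replaceAll("/\\*.*?\\*/", "");
    }

    public static void main(String[] args) {
        String s = "/* comment */ select * from XYZ; /* comment */";
        // Brackets make leading and trailing spaces visible.
        System.out.println("[" + stripGreedy(s) + "]"); // prints []
        System.out.println("[" + stripLazy(s) + "]");   // prints [ select * from XYZ; ]
    }
}
```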

1 Comment

Thanks for the explanation of laziness and greediness. At least I learned something. But please see my comment on Alexander Pavlov's post.
1

It won't work 100% of the time.

Comment markers can be part of a valid string literal in the SQL, and in that case they need to be kept...

I am just researching a solution... it seems to be complicated.

so far I have:

\G(?:[^']*?|'(?:[^']|'')*?'(?!'))*?\/\*.*?\*\/

But it matches everything up to and including the comment, while I need to match the comment only... and I just found out it could fail when preceded by a single-line comment... damn

1 Comment

OK, so now comments in strings won't be a problem anymore, but as for single-line comments before multi-line comments... I think a parser will really be easier: (?:[^']*?|'(?:[^']|'')*?'(?!'))*?\K\/\*.*?\*\/
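For what it's worth, the "parser" idea can be sketched as a small hand-written scanner instead of a regex. This is only an illustration under stated assumptions: string literals are single-quoted and escape quotes by doubling them (''), as the regex above assumes; backslash escapes, double-quoted or backtick-quoted identifiers, and -- comments are not handled. The class and method names are made up.

```java
class SqlCommentStripper {

    // Removes /* ... */ comments but copies single-quoted string literals verbatim.
    // Assumption: quotes inside literals are escaped by doubling ('').
    static String stripBlockComments(String sql) {
        StringBuilder out = new StringBuilder(sql.length());
        int i = 0;
        int n = sql.length();
        while (i < n) {
            char c = sql.charAt(i);
            if (c == '\'') {
                // Copy the literal up to and including its closing quote.
                out.append(c);
                i++;
                while (i < n) {
                    char d = sql.charAt(i);
                    out.append(d);
                    i++;
                    if (d == '\'') {
                        if (i < n && sql.charAt(i) == '\'') {
                            // '' is an escaped quote: still inside the literal.
                            out.append('\'');
                            i++;
                        } else {
                            break; // closing quote reached
                        }
                    }
                }
            } else if (c == '/' && i + 1 < n && sql.charAt(i + 1) == '*') {
                // Skip to just past the closing "*/" (or to the end if unterminated).
                int end = sql.indexOf("*/", i + 2);
                i = (end < 0) ? n : end + 2;
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String sql = "INSERT INTO t VALUES ('keep /* this */'); /* drop\nthis */ SELECT 1;";
        System.out.println(stripBlockComments(sql));
    }
}
```

Because the scanner tracks whether it is inside a literal, the /* this */ inside the quoted value survives while the real comment is dropped. Supporting single-line comments would mean adding one more branch to the same loop.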
