
In my Java application I need to store a big table on the hard disk, as I want it to be persistent.

My first try was like this (i and j can climb up to 300,000 and more, so an in-memory array would have 300,000² double entries, which crashes my system):

stmt.executeUpdate("DROP TABLE IF EXISTS calculations;");
stmt.executeUpdate("CREATE TABLE calculations (factorA, factorB, result);");
double temp = 0;
for (i = 0; i < datasource.size(); i++) {
    for (int j = 0; j < datasource.size(); j++) {
        if (i != j) {
            temp = calc(datasource.get(i),datasource.get(j));
            stmt.execute("INSERT INTO calculations (factorA, factorB, result) VALUES ('"+i+"','"+j+"','"+temp+"')");
        }
    }
}

Now, this performs extremely slowly, probably because a fresh SQL command string is built and executed for every single insert.

My new guess is that it's probably better to first calculate the results for, say, 10,000 values of i and THEN store them in the database as one unit.

But before I try to implement that, does anybody have a better idea? Using a database is not mandatory; I just want easy access and something quick to implement.

4 Comments
  • Make sure to do your bulk inserts as a transaction (sqlite.org/lang_transaction.html); this queues them up in the SQLite engine, and when you indicate the transaction is done, it commits them lightning fast instead of doing them one by one. Commented Jan 12, 2014 at 16:10
  • Do you need fast writes or just fast reads? How many rows are you going to extract? Is the composition of factorA+factorB usable as an index? Commented Jan 12, 2014 at 16:13
  • Actually I need both fast writing AND reading, but writing first. Commented Jan 12, 2014 at 16:15
  • OK, I use PreparedStatement now; it seems that this performs quite well. Commented Jan 12, 2014 at 16:24

2 Answers


Try adding every n or so rows inside one transaction (assuming failure is not an issue, i.e. if some rows fail to insert, you can go on without rolling back the previous ones). Declare a counter outside the loop:

int n = 1000;  // commit every 1000 rows; tweak as needed
int count = 0; // counter for rows inserted

Start the transaction in the outer loop, then increase and check the counter in the inner loop:

count++;
if (count % n == 0) {
    // commit the transaction here
}

(Don't forget to open a new transaction again in the outer loop.)

http://docs.oracle.com/javase/tutorial/jdbc/basics/transactions.html
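Spelled out, a minimal sketch of this commit-every-n-rows pattern, assuming an open java.sql.Connection and the question's datasource and calc(); once autocommit is off, JDBC implicitly starts the next transaction after each commit, which is why no explicit "reopen" call appears:

import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

class BlockCommitSketch {
    // Stand-in for the question's calc().
    static double calc(double a, double b) { return a * b; }

    static void insertInBlocks(Connection conn, List<Double> datasource) throws SQLException {
        int n = 1000;  // commit every 1000 rows
        int count = 0; // rows inserted so far
        conn.setAutoCommit(false); // opens the first transaction
        try (Statement stmt = conn.createStatement()) {
            for (int i = 0; i < datasource.size(); i++) {
                for (int j = 0; j < datasource.size(); j++) {
                    if (i == j) continue;
                    double temp = calc(datasource.get(i), datasource.get(j));
                    stmt.executeUpdate("INSERT INTO calculations (factorA, factorB, result) "
                            + "VALUES ('" + i + "','" + j + "','" + temp + "')");
                    count++;
                    if (count % n == 0) {
                        conn.commit(); // ends this transaction; the next begins implicitly
                    }
                }
            }
            conn.commit(); // commit the final partial block
        }
    }
}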


2 Comments

Thanks, this is actually what I'm doing right now. But still, reading is quite slow. I just think that when you need to access really BIG tables, SQLite is just not the way to go. But what else?!
Is your query slow? Have you tried running it directly in an SQL IDE? Try pagination, reading/writing some rows at a time; that's memory efficient.
int BLOCK_SIZE = 15000;
stmt.executeUpdate("DROP TABLE IF EXISTS calculations;");
stmt.executeUpdate("CREATE TABLE calculations (factor_idx text NOT NULL PRIMARY KEY, result text NOT NULL);");
double temp = 0;
int block_ctr = 1;
StringBuilder sb = new StringBuilder();
for (int i = 0; i < datasource.size(); i++) {
    for (int j = 0; j < datasource.size(); j++) {
        if (i != j) {
            temp = calc(datasource.get(i), datasource.get(j));
            if (block_ctr == 1) {
                // start a new compound statement
                sb.append("INSERT INTO 'calculations' SELECT '" + i + "_" + j + "' AS 'factor_idx', '" + temp + "' AS 'result'");
            } else if (block_ctr <= BLOCK_SIZE) {
                // extend it; note the leading space before UNION
                sb.append(" UNION SELECT '" + i + "_" + j + "','" + temp + "'");
            } else {
                // flush the full block, then start a new one with the current row
                stmt.execute(sb.toString());
                sb.setLength(0); // better than creating a new StringBuilder
                sb.append("INSERT INTO 'calculations' SELECT '" + i + "_" + j + "' AS 'factor_idx', '" + temp + "' AS 'result'");
                block_ctr = 1;
            }
            block_ctr++;
        }
    }
}
if (sb.length() > 0) {
    stmt.execute(sb.toString()); // flush the final partial block
}

I reduced the number of columns, and I build one compound statement per block using a StringBuilder. It should be a lot faster, and it allows faster reads through the index on the primary key column, which you create by concatenating i and j. Try it and let me know, I'm curious :)
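For the read side, a small sketch of the point lookup this schema enables, assuming the key is built as i + "_" + j exactly as above (conn is an open JDBC Connection; the class and method names are illustrative):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

class LookupSketch {
    // Fetch one precomputed result by its composite key; NaN if the pair is absent.
    static double lookup(Connection conn, int i, int j) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(
                "SELECT result FROM calculations WHERE factor_idx = ?")) {
            ps.setString(1, i + "_" + j);
            try (ResultSet rs = ps.executeQuery()) {
                // result is stored as text in this schema, so parse it back
                return rs.next() ? Double.parseDouble(rs.getString("result")) : Double.NaN;
            }
        }
    }
}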

4 Comments

Funny, this just stops when i gets > 1... Hmm, I need to figure out why that happens.
When I reduce BLOCK_SIZE to 500, I get this: [SQLITE_ERROR] SQL error or missing database (too many terms in compound SELECT). When I use 100 for BLOCK_SIZE it performs quicker, but not as quick as with PreparedStatement :-D
I didn't know about the limit on the number of terms in a compound SELECT... I guess I learned something :) I think my idea of a faster read on a single indexed column will work better than selecting over both i and j. Use PreparedStatement with the table structure I proposed (a sketch of that combination follows below).
Using 8000x8000 for i and j, reading performs great! This is my way to go... thanks a lot!
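Pulling the thread's conclusion together, a minimal sketch of PreparedStatement inserts against the proposed two-column schema; the helper names are hypothetical, not from the thread:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class KeyedInsertSketch {
    // Prepare the insert once and reuse it for every row.
    static PreparedStatement prepare(Connection conn) throws SQLException {
        return conn.prepareStatement(
                "INSERT INTO calculations (factor_idx, result) VALUES (?, ?)");
    }

    // Queue one row; the key is the same i + "_" + j composite as in the answer.
    static void insert(PreparedStatement ps, int i, int j, double result) throws SQLException {
        ps.setString(1, i + "_" + j);
        ps.setString(2, Double.toString(result)); // stored as text, per the schema
        ps.addBatch();
    }
}

Calling ps.executeBatch() and conn.commit() every few thousand rows, as in the first answer, keeps both memory use and per-row transaction overhead flat.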
