MySQL select performance issue with JDBC

Question

I'm having a performance issue with MySQL (v5.5) while running a local Java application that uses mysql-connector v5.1.30.

I have 2 tables called A and B. I run something like this:

PreparedStatement stmtA = conn.prepareStatement("select x, y, z, u, v, w from A");
PreparedStatement stmtB = conn.prepareStatement("select val from B where something = ?");
ResultSet resultA = stmtA.executeQuery();
while (resultA.next()) {
    // do something with result row
    if (some condition) { // true approx. 80% of the time
        // bind the parameter, then execute (executeQuery() takes no argument on a PreparedStatement)
        stmtB.setInt(1, some_value_from_resultA);
        ResultSet resultB = stmtB.executeQuery();
        // do something with result row
        resultB.close();
    }
}
stmtB.close();
stmtA.close();

The problem is that this takes ~12 minutes to complete every time. Here are a few relevant bits of information:

The mysqld process is using 95-99% CPU for the 12 minute duration
The java process is using 1-4% CPU for the 12 minute duration
Memory usage for both processes is negligible
Table A has 65,000 records, 18 columns (all VARCHAR except for Id) and uses MYISAM
Table B has 51,000 records, 4 columns (3 INT and 1 VARCHAR) and uses MYISAM
There is no index/key on the Table B column that I'm querying
I'm running Ubuntu 12.04 with a Core i5 and 4GB RAM

Aside from the extreme CPU usage of the mysqld process, I think there is a problem because the application has additional functionality that runs an almost identical method against a different database table. That table has a similar number of columns to table A (and also uses MYISAM), but one big difference is that there is no table B involved. It works the same as above, but without querying a second table while iterating over the rows of the first.

This other table has 4,500 rows and the iteration takes less than 1 second.

So it would appear that the nested query of table B is causing the problem, but I'm not sure. I have limited experience with MySQL. If you need any more info, please ask.

So, you're executing 0.8 * 65000 queries, on a table which has no index on the "something" column. No wonder it's slow. At least add an index. And try to use joins to execute much fewer queries instead of executing so many queries.
– JB Nizet, Commented Jun 11, 2014 at 20:16
I knew there would be some performance penalty for doing that many queries, but I just didn't think it would be that severe. I'm thinking of just pulling the 2 columns that I'm interested in from "Table B" into a hash in memory (they're just integers). Can you suggest an alternative? Also, adding a table index is not an option for me.
– RTF, Commented Jun 11, 2014 at 20:23
You have a poorly designed database, a poorly designed application, no indexes, and a relatively large number of queries, and you're surprised that you have poor performance? Is there any reason that the query needs to be nested in the loop?
– Noah, Commented Jun 11, 2014 at 20:24
@Noah you know very little about my application or database to be making statements like that. There are indexes on tables, just not the column I have to query. Also, I didn't design the database, I'm just working with what I have.
– RTF, Commented Jun 11, 2014 at 20:27
@Noah I need to nest it because I won't know what query parameter to use for the nested query until I've retrieved it from the current iteration of Table A.
– RTF, Commented Jun 11, 2014 at 20:36

Eran · Accepted Answer · 2014-06-11 20:29:59Z

You might want to pre-load or cache the results of the queries you run on table B. Since table B doesn't have many records, you can afford to load it into a map.

You have two options:

Pre-load the entire B table and create a Map from something (key) to val (value). Then access that map instead of running 0.8 * 65,000 = 52,000 queries. That would be most effective if something has many unique values and your B queries search for many of them.
Store the result of each query on B in the map and check that map before running these queries. There's no point in loading the same data twice. That would be more efficient in terms of memory if you only use a small subset of the something values in your B query.
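
A minimal sketch of the first option, assuming that something and val are both INT columns (as the asker states in the comments) and that keeping the first row is acceptable when something is not unique. The class and method names here are hypothetical, chosen only for illustration:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.HashMap;
import java.util.Map;

public class TableBPreload {

    /** Runs one full scan of B and returns a something -> val map. */
    public static Map<Integer, Integer> load(Connection conn) throws SQLException {
        // Assumption: both columns are INT, as the asker states in the comments.
        try (PreparedStatement stmt = conn.prepareStatement("select something, val from B");
             ResultSet rs = stmt.executeQuery()) {
            return toMap(rs);
        }
    }

    /** Copies (something, val) rows into a map; the first row wins on duplicate keys. */
    public static Map<Integer, Integer> toMap(ResultSet rs) throws SQLException {
        Map<Integer, Integer> valBySomething = new HashMap<>();
        while (rs.next()) {
            // Note: getInt returns 0 for SQL NULL; check rs.wasNull() if B can contain NULLs.
            valBySomething.putIfAbsent(rs.getInt(1), rs.getInt(2));
        }
        return valBySomething;
    }
}
```

Inside the loop over table A, a call such as valBySomething.get(key) would then stand in for the per-row query on B; a null result means B has no matching row.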

Adding an index on table B would also help.
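
The second option (query B only the first time a given something value is seen) could look roughly like the sketch below. The class name and the queryB function are hypothetical: queryB stands in for whatever code binds the parameter and runs the existing "select val from B where something = ?" statement, returning an empty Optional when no row matches. Results are wrapped in Optional so that misses are cached too, since Map.computeIfAbsent records nothing when the function returns null.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.IntFunction;

public class MemoizedLookup {

    // Caches hits and misses alike, so each distinct key reaches the database at most once.
    private final Map<Integer, Optional<Integer>> cache = new HashMap<>();

    // Hypothetical loader: wraps the real query against table B.
    private final IntFunction<Optional<Integer>> queryB;

    public MemoizedLookup(IntFunction<Optional<Integer>> queryB) {
        this.queryB = queryB;
    }

    /** Returns the cached val for this key, calling queryB only on first use. */
    public Optional<Integer> valFor(int something) {
        return cache.computeIfAbsent(something, key -> queryB.apply(key));
    }
}
```

This keeps memory proportional to the number of distinct values actually looked up, at the cost of still issuing one unindexed query per distinct value.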

edited Jun 11, 2014 at 20:29

answered Jun 11, 2014 at 20:24


3 Comments

RTF Over a year ago

I was just thinking something similar. I might go with your first suggestion. I only need two columns from Table B, and they're both integers. Plus there will be a lot of unique values that I need.

Eran Over a year ago

@RTF That should be efficient enough, as long as table B doesn't grow too much. And if you are going to query many unique values, an index wouldn't make much of a difference.

RTF Over a year ago

Tried loading it all into a hash and the application run now completes in 6 seconds. I didn't realize those queries would be so intensive. Thanks everyone!
