
Hi, I am trying to fetch 50K+ rows from one of the tables in a MySQL DB. It is taking more than 20 minutes to retrieve all the data and write it to a text file. Can I use multithreading to reduce the fetching time and make the code more efficient? Any help will be appreciated.

I have used a normal JDBC connection and ResultSetMetaData to fetch rows from the table.

String row = "";
Statement stmt = conn.createStatement();
ResultSet rs = stmt.executeQuery("select * from employee_details");
ResultSetMetaData rsmd = rs.getMetaData();
int columnCount = rsmd.getColumnCount();
while (rs.next()) {
    for (int i = 1; i <= columnCount; i++) { // JDBC column indexes are 1-based
        row = row + rs.getObject(i) + "|";
    }
    row = row + "\r\n";
}

And I am writing the fetched values to a text file as below.

BufferedWriter writer = new BufferedWriter(new FileWriter(
        "C:/Users/430398/Desktop/file/abcd.txt"));
writer.write(row);
writer.close();
  • I don't think that multithreading will speed this up at all. Think of where and what the bottleneck is -- will threading affect this? Commented Nov 25, 2016 at 12:49
  • But there's no harm in trying it and seeing what happens -- what did you find with your experimentation with this? Commented Nov 25, 2016 at 12:49
  • I have tried using a simple JDBC connection and ResultSetMetaData to fetch the values. It is taking 20 minutes. I am not sure whether multithreading will work here or not. Commented Nov 25, 2016 at 13:15
  • Again, I don't think so, since the bottleneck I believe is in fetching the data from the disk, and I don't see how threading will speed that up, but let's see what the experts have to say. Commented Nov 25, 2016 at 13:25
  • Please look up SELECT INTO OUTFILE (see the sketch after these comments). Commented Nov 25, 2016 at 13:48
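
For reference, a minimal sketch of that last suggestion. The file path and privileges here are assumptions, not from the thread: the MySQL account needs the FILE privilege, and the path must be writable by the server process and must not already exist.

    // Sketch: let MySQL write the file itself, server-side, so no rows
    // travel over the JDBC connection at all.
    // '/tmp/employee_details.txt' is a hypothetical server-side path.
    try (Statement stmt = conn.createStatement()) {
        stmt.execute(
            "SELECT * INTO OUTFILE '/tmp/employee_details.txt' "
          + "FIELDS TERMINATED BY '|' LINES TERMINATED BY '\\r\\n' "
          + "FROM employee_details");
    }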

2 Answers


Remember that rs.next() fetches results from the DB in batches of n rows, where n is a number defined by the JDBC implementation; I assume it's at 10 right now. So after every 10 rows it will query the DB again, hence there'll be network overhead - even if it's on the very same machine.

Just increasing that number will result in a faster loading time.

Edit: adding stmt.setFetchSize(50000); might be it.

Be aware that this results in heavy memory consumption.
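
A minimal sketch of that suggestion (the fetch-size value is illustrative, not prescribed by this answer):

    // Hint the driver to pull more rows per network round trip.
    Statement stmt = conn.createStatement();
    stmt.setFetchSize(1000); // tune and measure; larger = more memory
    ResultSet rs = stmt.executeQuery("select * from employee_details");
    // Note: MySQL Connector/J treats this only as a hint; it reads the
    // whole result set at once by default, and true row-by-row streaming
    // needs setFetchSize(Integer.MIN_VALUE) or useCursorFetch=true.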


6 Comments

I have tried adding stmt.setFetchSize(50000); but still it took 19 minutes to fetch the data.
@BasheerAhmed: but that's better than 20+ minutes.
@HovercraftFullOfEels Sorry for the confusion. 20+ minutes was around 21 minutes and 45 seconds.
You might also have to play around a bit. You can set it to another number as well; try 1000 to begin with.
Fetching more rows with one query should speed it up for sure.

First you need to identify where the bottleneck is. Is it the SQL query? Or the fetching of the rows via the ResultSet? Or the building of the huge string? Or perhaps writing the file?

You need to measure the duration of the individual parts mentioned above and tell us the results. Without this knowledge it is not possible to say how to speed up the algorithm.
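
For example, a rough sketch of such a measurement, reusing the conn/stmt setup from the question (the split points are one possible choice, not the only one):

    long t0 = System.nanoTime();
    ResultSet rs = stmt.executeQuery("select * from employee_details");
    long t1 = System.nanoTime(); // query execution

    StringBuilder sb = new StringBuilder();
    while (rs.next()) {
        // ... build the row exactly as in the question ...
    }
    long t2 = System.nanoTime(); // fetching rows + building the string

    System.out.printf("query: %d ms, fetch+build: %d ms%n",
            (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000);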

4 Comments

Fetching the data using the ResultSet is taking much more time than writing it to the file. I am thinking of using batches (using OFFSET) in SQL to reduce the fetch time.
Did you try the fetching WITHOUT building the huge string?
The problem was with the building of the huge String. I used a StringBuilder and the data is now fetched within 2 seconds (see the sketch below).
I would then suggest that you accept my answer - I consider it actually canonical in the given context, in the sense that it makes clear things have to be measured before they can be optimized. And of course, the measurement needs to make sure that only one aspect is being measured.
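
For completeness, a sketch of the change the asker describes: the same loop as in the question, but appending to a StringBuilder instead of copying the whole String on every concatenation (variable names follow the question's code).

    // O(n) amortized appends instead of O(n^2) String concatenation.
    StringBuilder sb = new StringBuilder();
    while (rs.next()) {
        for (int i = 1; i <= columnCount; i++) {
            sb.append(rs.getObject(i)).append('|');
        }
        sb.append("\r\n");
    }
    writer.write(sb.toString());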
