1

I have to read Employee data from a text file(each record is separated by tab) into a ArrayList. Then I have to insert this employee objects from list to the Employee table in DB. For this, I am iterating the list elements one by one and inserting Employee details one at a time into DB. This approach is not recommended performance wise because we can have more than 100k records and it will take so much time to insert the whole data.

How can we use multi threading here while inserting data from list to db to improve performance. Also how can we use CountDownLatch and ExecutorService classes to optimize this scenario.

ReadWriteTest

public class ReadWriteTest {

public static void main(String... args) {
    BufferedReader br = null;
    String filePath = "C:\\Documents\\EmployeeData.txt";
    try {
        String sCurrentLine;
        br = new BufferedReader(new FileReader(filePath));
        List<Employee> empList = new ArrayList<Employee>();

        while ((sCurrentLine = br.readLine()) != null) {
            String[] record = sCurrentLine.split("\t");
            Employee emp = new Employee();
            emp.setId(record[0].trim());
            emp.setName(record[1].trim());
            emp.setAge(record[2].trim());
            empList.add(emp);
        }
        System.out.println(empList);

        writeData(empList);

    } catch (IOException | SQLException e) {
        e.printStackTrace();
    }
}

public static void writeData(List<Employee> empList) throws SQLException {
    Connection con =null;
    try{  
        Class.forName("oracle.jdbc.driver.OracleDriver");  

        con=DriverManager.getConnection("jdbc:oracle:thin:@localhost:1521:xe","system","oracle");  
        for(Employee emp : empList)  
        {
        PreparedStatement stmt=con.prepareStatement("insert into Employee values(?,?,?)");  
        stmt.setString(1,emp.getId()); 
        stmt.setString(2,emp.getName());
        stmt.setString(3,emp.getAge());
        stmt.executeUpdate();   
        }         
        }catch(Exception e){ 
            System.out.println(e);
        }
        finally{
            con.close();
        }   
        }  
}

Employee Class

public class Employee {

String id;
String name;
String age;

public String getId() {
    return id;
}
public void setId(String id) {
    this.id = id;
}
public String getName() {
    return name;
}
public void setName(String name) {
    this.name = name;
}
public String getAge() {
    return age;
}
public void setAge(String age) {
    this.age = age;
}
@Override
public String toString() {
    return "Employee [id=" + id + ", name=" + name + ", age=" + age + "]";
}
}

EmployeeData.txt

1   Sachin  20
2   Sunil   30
3   Saurav  25

3 Answers 3

2

Direct import

An alternative to the Java application approach is a database approach. All major databases have tools which can import data directly from a text file to table.

Postgres has the COPY command. This can be run from the command line or from within SQL. See the wiki page for discussion.

Look at your database tool set.

Sign up to request clarification or add additional context in comments.

1 Comment

Yes, such import tools like Postgres COPY can be very fast to execute. The data is more directly fed to the table without having to process your SQL statements. The limitation is that your data to be imported must closely match the structure of the targeted table as these tools typically provide little in the way of processing or alteration features, another reason for their speed. As a programmer I keep learning to think beyond my programming language (Java etc.) and leverage the powerful features available in a worthy database platform like Postgres.
1

I agree with @kuporific. Batch update will prove to be better from a performance point of view.

Give a try with the following edit to your code:

    public static void writeData(List<Employee> empList) throws SQLException {
    Connection con =null;
    final int BATCH_SIZE = 1000; // just an indicative number
    try{  
        Class.forName("oracle.jdbc.driver.OracleDriver");  
        con=DriverManager.getConnection("jdbc:oracle:thin:@localhost:1521:xe","system","oracle");  
        Statement statement = con.createStatement();
        int counter = 0;
        for(Employee emp : empList)  
        {
            String query = "insert into Employee (id, name, city) values('"
                    emp.getId() + "','" + emp.getName() + "','" + emp.getAge() + "')";
            statement.addBatch(query);
            if (counter % BATCH_SIZE == 0){
                statement.executeBatch();
            }
            counter++;  
        }

        statement.close();

        }catch(Exception e){ 
            System.out.println(e);
        }
        finally{
            con.close();
        }   
}  

Comments

1

Depending on your application, it might make sense to put the DB update code on a thread off of the main application thread. You can do this using Executors for example.

You could also look into using batch updates instead.

I would suspect that trying to update the DB on multiple threads isn't going to speed things up because the DB has to maintain atomicity, so any table can only be updated by one thread at a time anyway.

You could go really crazy and do both actions off of the main thread using Java 8's CompletableFuture:

CompletableFuture.supplyAsync(new Supplier<List<Employee>>()
{
    @Override
    public List<Employee> get()
    {
        List<Employee> employees = new ArrayList<>();
        // get employee list
        return employees;
    }
}).thenAcceptAsync(new Consumer<List<Employee>>()
{
    @Override
    public void accept(List<Employee> employees)
    {
        // put into DB using batching
    }
});

The first supplyAsyc will call the given code on another thread. When it completes, the return value is passed to the Consumer in the thenAcceptAsync, and that function is also run on another thread.

This can be written more compactly as:

CompletableFuture.supplyAsync(() -> {
    List<Employee> employees = new ArrayList<>();
    // get employee list
    return employees;
}).thenAcceptAsync(employees -> {
    // put into DB using batching
});

1 Comment

I have a similar use case, but mine is trying to get data from database and deliver it to the UI. I am new to CompletableFuture as I am trying to replace AsyncTask [also ExecutorService.submit() with .get()] to load a List of CardViews from Room database to the UI. Not having any luck, would appreciate any thoughts or ideas here: stackoverflow.com/questions/67294345/…

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.