I am working on a system that runs multiple database queries in parallel. Because there is a lot of data to query from each database, the requirement is to keep each database's queries isolated from the others: load on one database/table should not affect queries against other tables. I developed a solution in Java using ExecutorService, with one ExecutorService (fixed size, 1 thread) per database. I maintain a map of database name to ExecutorService and direct each incoming query request to the corresponding executor service. Considering that as many as one hundred databases may be queried in parallel, I am not sure ExecutorService is the right choice. I have done some evaluation and the initial results look okay. One challenge with this solution is that, because I create the ExecutorServices dynamically, it is getting tough to shut them down gracefully when the application stops.

Another way to tackle this problem is to maintain a global (that is, shared across all databases) pool of query worker threads and reuse them at random for incoming requests. But this will not guarantee that all database queries are given equal priority.

DataSetExecutorFactory.java

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class DataSetExecutorFactory {

    // One DataSetExecutor per database name.
    private static final Map<String, DataSetExecutor> executorMap = new ConcurrentHashMap<>();

    public static DataSetExecutor getDataSetExecutor(String dbName) {
        // computeIfAbsent makes lookup-or-create atomic, so two threads asking
        // for the same database cannot each create their own executor.
        return executorMap.computeIfAbsent(dbName, DataSetExecutor::new);
    }
}

DataSetExecutor.java

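import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class DataSetExecutor {

    private final String dbName;
    // A single worker thread: queries for this database run one at a time.
    private final ExecutorService executor = Executors.newFixedThreadPool(1);

    public DataSetExecutor(String dbName) {
        this.dbName = dbName;
    }

    // Returns the Future so the caller decides when to block. The original
    // declared List<Map<String, Object>> but did not show the conversion
    // from QueryResult, so that step is left out here.
    public Future<QueryResult> execQuery(String collecName, Map<String, Object> queryParams) {
        // Construct the query job.
        // QueryWorker implements Callable<QueryResult> and does the actual query to the DB.
        QueryWorker queryWorker = new QueryWorker(queryParams);

        // submit() can throw RejectedExecutionException once the executor is shut down.
        return executor.submit(queryWorker);
    }
}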
  • What is your question? Commented Mar 9, 2018 at 20:01
  • I think this is a valid solution, although your queries won't run in parallel if you are pushing all onto a single thread executor service. Commented Mar 9, 2018 at 20:12
  • Why an executor for each database instead of 1 executor (with more than 1 thread in the pool) which handles any queryWorker? Pass the data source to the QueryWorker to let it create the connection. Commented Mar 9, 2018 at 20:56
  • @PatrickMevzek Is there an alternative or better way to achieve this use case? Commented Mar 10, 2018 at 1:01
  • @SamOrozco Yes, sequential execution of queries within an executor service per DB is fine with me. Commented Mar 10, 2018 at 1:02

1 Answer

I think you're misunderstanding how ExecutorService works. Rather than creating an ExecutorService for each database, you should make a single ExecutorService as a fixed thread pool of size n (n = number of databases or maximum number of parallel queries). The thread pool will do the parallel processing work for you. You simply need to track the database name as part of the QueryWorker that is submitted to the ExecutorService.
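As a minimal sketch of the single-pool idea, assuming a hypothetical NamedQuery class standing in for your QueryWorker (the class names, the string result, and the pool size are illustrative assumptions, not part of your code), each job carries its own database name and all jobs share one fixed pool:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class SharedQueryPoolSketch {

    // Hypothetical stand-in for QueryWorker: the job itself knows its database.
    static final class NamedQuery implements Callable<String> {
        private final String dbName;
        private final String query;

        NamedQuery(String dbName, String query) {
            this.dbName = dbName;
            this.query = query;
        }

        @Override
        public String call() {
            // A real worker would connect to dbName and run the query here.
            return dbName + ":" + query;
        }
    }

    // Submits one query per database to a single shared pool and
    // collects the results in submission order.
    static List<String> runAll(List<String> dbNames, String query, int poolSize) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String dbName : dbNames) {
                futures.add(pool.submit(new NamedQuery(dbName, query)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // blocks until that query finishes
            }
            return results;
        } finally {
            pool.shutdown(); // the one and only pool to shut down
        }
    }

    public static void main(String[] args) throws Exception {
        // prints [orders:count, users:count]
        System.out.println(runAll(Arrays.asList("orders", "users"), "count", 2));
    }
}
```

Note that a shared pool has one shared queue, so a burst of requests for one database can still delay requests for another; this sketch does not address that.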

This also makes shutdown easy: there is only one pool to manage, so you only need to shut it down once when the application closes.
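Shutting the pool down once could look roughly like this. It follows the usual two-phase pattern (stop accepting work, wait, then cancel what is left); the helper name and the timeout are assumptions for illustration:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class PoolShutdownSketch {

    // Returns true if the pool terminated within the allowed time.
    static boolean shutdownGracefully(ExecutorService pool, long timeout, TimeUnit unit) {
        pool.shutdown(); // reject new tasks, let queued and running tasks finish
        try {
            if (pool.awaitTermination(timeout, unit)) {
                return true;
            }
            pool.shutdownNow(); // interrupt running tasks, drop queued ones
            return pool.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        pool.execute(() -> System.out.println("running a query"));
        System.out.println("terminated: " + shutdownGracefully(pool, 5, TimeUnit.SECONDS));
    }
}
```

A call like this would typically go wherever your application already handles its stop event, for example a JVM shutdown hook registered with Runtime.getRuntime().addShutdownHook.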

All that being said, since all of this parallel processing happens in the same JVM and on the same machine, you might run into memory or CPU limitations depending on how intensive your querying is.


1 Comment

Thanks for your response. The reason I preferred one ExecutorService per DB is to isolate traffic by DB. It's possible that the request queue might fill up with queries for one DB, and I don't want that to affect the query response time for other DBs. Good point on the memory/CPU limitation; I will account for that.
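If the per-database isolation described in this comment is kept, the shutdown difficulty from the question could be handled by keeping every executor in one registry and closing them all in a single pass. The following is only a sketch under that assumption; PerDbExecutorRegistry and its method names are hypothetical and not part of the original code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class PerDbExecutorRegistry {

    // One single-threaded executor (and therefore one queue) per database name.
    private final Map<String, ExecutorService> executors = new ConcurrentHashMap<>();

    // Atomically returns the existing executor for dbName or creates one.
    ExecutorService forDatabase(String dbName) {
        return executors.computeIfAbsent(dbName, name -> Executors.newSingleThreadExecutor());
    }

    // Returns true only if every executor terminated within its timeout.
    boolean shutdownAll(long timeoutPerExecutor, TimeUnit unit) throws InterruptedException {
        // First stop intake everywhere so all executors drain concurrently.
        for (ExecutorService executor : executors.values()) {
            executor.shutdown();
        }
        boolean allTerminated = true;
        for (ExecutorService executor : executors.values()) {
            if (!executor.awaitTermination(timeoutPerExecutor, unit)) {
                executor.shutdownNow(); // give up on stragglers
                allTerminated = false;
            }
        }
        return allTerminated;
    }

    public static void main(String[] args) throws InterruptedException {
        PerDbExecutorRegistry registry = new PerDbExecutorRegistry();
        registry.forDatabase("orders").execute(() -> System.out.println("query on orders"));
        registry.forDatabase("users").execute(() -> System.out.println("query on users"));
        System.out.println("all terminated: " + registry.shutdownAll(5, TimeUnit.SECONDS));
    }
}
```

Because each database has its own queue here, a backlog for one database does not delay another, at the cost of one idle thread per database.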
