0

I have following query that returns data between a date ranges.

dates = ['20100101','20100201',20100301','20100401']

query = 'select date, company_name, total_amount from info_stats where date between 'start_date' and 'end_date'

I obtain date ranges from another process and construct a list so that I can iterate as follows:

pds = []
for idx in range(0, len(dates) - 1):
   formated_query = self.get_formated_query(start_date=dates[idx].strftime('%Y%m%d'),
                                                      end_date=dates[idx + 1].strftime('%Y%m%d')
   results_df = pds.append(pd.read_sql(sql=formated_query,con=db_connect))

to query I am passing date at index and index + 1 (date greater than date at the index)

These queries take super long time and i want to execute in a parallelize manner so that wait time is shorter. I went over joblib but not sure if this is multi-threading or multi-processing. Looks like the former. Also new to joblib, how can I parallelize above code using joblib or other package?

1

1 Answer 1

0

The question is quite broad, but I can share my own experience with parallelising queries against databases.

What I found is that if I have many small jobs, I can use the python built-in multithreading modules such as concurrent.futures. I will get speed ups.

However, if I have big jobs that take a long time to run on the database, parallelising does not help. This is because the database engine itself (in my case SQL Server), already does a splendid job of parallelising the job. In which case, the single big job is already maximising the number of processes the server can handle - putting more jobs won’t help. Your situation seems to be the this one.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.