I have a list containing multiple dataframes. These dataframes can be quite large and take some time to write to CSV files, so I tried to write them concurrently using multithreading to reduce the total time. Why does the multithreaded version take more time than the sequential version? Is writing a dataframe to CSV with pandas not an I/O-bound operation, or am I not implementing it correctly?
Multithreading:
import concurrent.futures
import time

list_of_dfs = [df_a, df_b, df_c]

start = time.time()
with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
    results = executor.map(
        lambda i: list_of_dfs[i].to_csv('Rough/' + str(i) + '.csv', index=False),
        range(len(list_of_dfs)),
    )
print(time.time() - start)
>>> 18.202364921569824
Sequential:
start = time.time()
for i in range(len(list_of_dfs)):
    list_of_dfs[i].to_csv('Rough/' + str(i) + '.csv', index=False)
print(time.time() - start)
>>> 13.783314228057861
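For reference, here is a self-contained version of the comparison that anyone can run. The dummy dataframes, their sizes, and the temporary output directory are my stand-ins for the real data (the originals are presumably much larger, so absolute timings will differ):

```python
import concurrent.futures
import os
import tempfile
import time

import numpy as np
import pandas as pd

# Dummy dataframes standing in for the real ones (an assumption:
# the actual frames are larger, so timings will not match the question).
list_of_dfs = [pd.DataFrame(np.random.rand(10_000, 10)) for _ in range(3)]

out_dir = tempfile.mkdtemp()  # stands in for the 'Rough/' directory

def write_df(i):
    # Each task writes one dataframe to its own CSV file.
    list_of_dfs[i].to_csv(os.path.join(out_dir, f"{i}.csv"), index=False)

# Multithreaded version.
start = time.time()
with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
    list(executor.map(write_df, range(len(list_of_dfs))))
threaded = time.time() - start

# Sequential version (separate filenames so the two runs do not overlap).
start = time.time()
for i in range(len(list_of_dfs)):
    list_of_dfs[i].to_csv(os.path.join(out_dir, f"seq_{i}.csv"), index=False)
sequential = time.time() - start

print(f"threaded:   {threaded:.3f}s")
print(f"sequential: {sequential:.3f}s")
```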