I have written a function create_time_series(input_df1, info_df1, unit_name, start_date, end_date) that builds a time series from the log files stored in input_df1. The function is slow, so I thought of parallelizing it.

The following code is my attempt at utilizing the multiprocessing library:

if __name__ == '__main__':
    arg = corrected_data,block_info,(unit for unit in block_info.UnitID.unique()),"2015-01-01","2021-12-31"
    with Pool(processes = 16) as pool:
        temp_data = pool.starmap(create_time_series,arg)
        out_data = pd.concat([out_data,temp_data[unit]],axis =1)

In the task manager I can see the processes running, but they seem to be idling. Hence my question: what did I do wrong in attempting to parallelize the task?

1 Answer

You are not splitting your load: you are giving the process pool only one item to process (arg). Check the documentation for starmap: it expects an iterable (e.g. a list) of tuples, each of which holds all the arguments required for one call.
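
As a minimal sketch of the shape starmap expects, the snippet below builds one argument tuple per unit. The body of create_time_series here is a hypothetical stand-in that only echoes its inputs, build_args is an illustrative helper that is not part of the original code, and the unit names and None placeholders are assumptions standing in for the real DataFrames.

```python
from multiprocessing import Pool


def create_time_series(input_df1, info_df1, unit_name, start_date, end_date):
    # Hypothetical stand-in for the real function: it only echoes its inputs.
    return f"{unit_name}: {start_date} to {end_date}"


def build_args(input_df1, info_df1, units, start_date, end_date):
    # One tuple per unit; each tuple holds the full argument list for one call.
    return [(input_df1, info_df1, unit, start_date, end_date) for unit in units]


if __name__ == "__main__":
    # Placeholder for block_info.UnitID.unique() (assumption).
    units = ["unit_1", "unit_2", "unit_3"]
    args = build_args(None, None, units, "2015-01-01", "2021-12-31")
    with Pool(processes=3) as pool:
        # starmap unpacks each tuple into one create_time_series(...) call.
        results = pool.starmap(create_time_series, args)
    print(results)
```

Each element of args becomes one task, so the pool has as many tasks to distribute as there are units.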

4 Comments

First of all, thanks for the response. I thought arg would be passed as an iterable because of the (unit for unit in block_info.UnitID.unique()) part, but as you said, it is not. How can I pass the parameters to the process pool for each unit while keeping the other parameters constant?
@Guilly if that's what you want you can do [(corrected_data, block_info, unit, "2015-01-01", "2021-12-31") for unit in block_info.UnitID.unique()]
I assume that I should replace arg with [(corrected_data, block_info, unit, "2015-01-01", "2021-12-31") for unit in block_info.UnitID.unique()], since the new expression is an explicit iterable. To be fair, I am rather confused: isn't this exactly what arg already contains?
@Guilly no, it is not. As you wrote it, arg is something like (corrected_data, block_info, (unit_1, unit_2, unit_3), "2015-01-01", "2021-12-31"). The comprehension that I showed you will create something like this: [(corrected_data, block_info, unit_1, "2015-01-01", "2021-12-31"), (corrected_data, block_info, unit_2, "2015-01-01", "2021-12-31"), (corrected_data, block_info, unit_3, "2015-01-01", "2021-12-31")]
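
If only the unit changes between calls, another option is to fix the constant arguments with functools.partial and hand the units to pool.map. This is a sketch under assumptions: the body of create_time_series is a hypothetical stand-in, make_worker is an illustrative helper that does not appear in the original code, and the unit names and None placeholders stand in for the real data.

```python
from functools import partial
from multiprocessing import Pool


def create_time_series(input_df1, info_df1, unit_name, start_date, end_date):
    # Hypothetical stand-in for the real function: it only echoes its inputs.
    return (unit_name, start_date, end_date)


def make_worker(input_df1, info_df1, start_date, end_date):
    # Freeze every argument except unit_name; the result takes a single unit.
    return partial(create_time_series, input_df1, info_df1,
                   start_date=start_date, end_date=end_date)


if __name__ == "__main__":
    # Placeholder for block_info.UnitID.unique() (assumption).
    units = ["unit_1", "unit_2", "unit_3"]
    worker = make_worker(None, None, "2015-01-01", "2021-12-31")
    with Pool(processes=3) as pool:
        # One call per unit; results come back in the order of units.
        results = pool.map(worker, units)
    print(results)
```

Either way, the constant arguments are pickled and sent to the worker processes, which may be costly when the DataFrames are large.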
