
I have code that takes 21 minutes to run because of a for loop. The loop itself is not complicated, but I need that many iterations.

My code looks like this:

results=[]
for i in range(len(files)) ### around 5000 files with 96 rows each that evaluates
    results_f= function(arg1[i], arg2)
    results= results.append(results_f)

How can I do this with multithreading?

I have tried something like:

with concurrent.futures.ThreadPoolExecutor() as executor:    
    for i in range(len(files)):
        results = executor.map(function, [arg1[i],arg2])

which I saw working somewhere, but it does not work at all.

  • There's a syntax error in your code which makes this a really bad example. Also, you're indexing arg1 with an index valid for files. Further, "not working at all" is not a problem description. Please try out any actual example code that works and then adapt it to your case. If that causes problems, be specific! At the moment, your Q could be summarized "please teach me multithreading", which is not a valid topic here. Commented Mar 17, 2022 at 8:12

1 Answer

I'm not sure this will help, but here is what I came up with myself.

First of all, executor.map(fn, *iterables, ...) takes iterables as input. This means you need to pass two lists of the same length as arguments: map takes one element from each iterable per call, the same way zip does, and passes them to the function. The first iterable supplies the first argument of each call, and the second supplies the second.

Secondly, since map iterates over the inputs itself, you can get rid of the for loop.

And finally, it returns a generator that yields the results in input order. What you do with it is up to you.

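from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor(max_workers=4) as executor:
    # arg2 is repeated so that both iterables have the same length
    results = executor.map(function_name, arg1, [arg2] * len(arg1))

list(results)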
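
As a minimal sketch of the call above, assuming a hypothetical stand-in function evaluate(item, scale) in place of your function and a short list in place of arg1 (both invented for illustration), the following compares the plain loop with list(executor.map(...)). The two should produce the same values in the same order:

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate(item, scale):
    # Hypothetical stand-in for the real per-file function.
    return item * scale


items = [1, 2, 3, 4]   # plays the role of arg1
scale = 10             # plays the role of arg2

# Plain loop, as in the question.
loop_results = []
for item in items:
    loop_results.append(evaluate(item, scale))

# Same work through executor.map: one element from each iterable per call.
with ThreadPoolExecutor(max_workers=4) as executor:
    map_results = list(executor.map(evaluate, items, [scale] * len(items)))

print(map_results == loop_results)
```

Wrapping the call in list(...) is what turns the generator into an ordinary list of return values.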

Link to the documentation.


2 Comments

Thanks for your answer, but results is not the result of the loop I wrote in the question. It is, as you said, a generator, but I don't know how to turn that into my numerical result.
list(results) in my example forces it to run. Also, you may want to use ProcessPoolExecutor to get real parallelism.
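
To illustrate the ProcessPoolExecutor suggestion: both pool types expose the same map method, so switching between them can be a one-argument change. The sketch below assumes a hypothetical CPU-bound function evaluate and a helper evaluate_all that takes the executor class as a parameter. Neither name comes from the question.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import repeat


def evaluate(item, scale):
    # Hypothetical CPU-bound stand-in for the real function.
    return sum(i * scale for i in range(item))


def evaluate_all(items, scale, executor_cls=ThreadPoolExecutor, workers=4):
    # executor_cls may be ThreadPoolExecutor or ProcessPoolExecutor;
    # both provide the same map() method.
    with executor_cls(max_workers=workers) as executor:
        # repeat(scale) supplies the same second argument to every call;
        # map stops when the shortest iterable (items) is exhausted.
        return list(executor.map(evaluate, items, repeat(scale)))


# Usage with processes (keep the call under the main guard, because worker
# processes may re-import this module):
#
# from concurrent.futures import ProcessPoolExecutor
# if __name__ == "__main__":
#     results = evaluate_all(range(1, 6), 2, executor_cls=ProcessPoolExecutor)
```

With threads, pure-Python CPU-bound work is serialized by the global interpreter lock, so threads mainly help when the function spends its time waiting on I/O such as reading files. With processes, the function and its arguments must be picklable.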
