0

I have a functioning loop that runs a series of sequential tests on each of a list of elements. Unfortunately the list is growing quickly, as are the number of tests, and I'd like to see if I can add threading to it in such a way that it can perform the tests on several different elements simultaneously. The tests themselves must be run sequentially as each may rely on previous data returned by the tests to perform the next one, but each element is independent and doesn't require data from another element. Only after each element is complete would something need to be done on the entire dataset.

def do_some_tests_on_a_list_of_elements(element_list):
    do_some_stuff_here_to_set_up()

    for index, element in enumerate(element_list, start=1):
        element = do_some_stuff_on_element(index, element) 

    do_some_stuff_after_each_element_has_finished()

For example in this code, I'd want to do the setup, then allow the loop to process more than one element at a time via threading, and then after all elements are finished, do the final step on the dataset. What would be the simplest mechanism to achieve this?

1
  • 1
    It sounds like you want to process items in parallel on different CPUs. As I understand the terminology, this is different than running multiple threads on a single CPU. My suggestion is to search for the terms 'parallel processing', 'Python', and probably 'example' or 'tutorial'. I have done this some time ago Commented Apr 18, 2018 at 19:48

1 Answer 1

3

Like all questions with threading, to answer accurately, it is very important to know what exactly the tests do, as in, if they are using the CPU to compute something, or if they are using the network.

Python's most common implementation (CPython) has a global interpreter lock (GIL) that prevents threadings from running python code in parallel. So if your tests are CPU-based, they can't run at the same time using threads. You have to use processes to do it.

If your tests on the other hand are IO-based (waiting for network packets) then you could use threads, but it is better to just use asynchronous requests insteads. You don't need extra threads to wait for network, you can just not wait.

Sign up to request clarification or add additional context in comments.

1 Comment

Sounds like it would be parallel processing then as opposed to multithreading as the tests are not merely waiting on network IO. Thanks for the clarification.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.