Python: Using threads for processing jobs

Question

There is a fairly large multithreaded Python 2 web application. The business logic runs in the main thread, and the sub-threads mostly run database operations. No ThreadPoolExecutor is used at the moment, and one cannot be introduced in the near future. I want to add another thread that processes a certain amount of data (fast) and stores the result in the database (an I/O operation). This operation won't be executed very often.

So the question is: should I keep a mostly sleeping thread that waits for an event before processing the data, or is it better to spawn a new thread from the main thread when there is enough data and let it exit once processing has completed? Note that there is already a fairly large number of threads running for the GIL to switch between.

Thanks.

Well, actually I'm SURE that subprocesses would do much better :) Though because the DB operations run in threads, there is a performance gain. For now we just cannot get rid of the threads, because the application is very big and old and is mostly poorly designed. Also, I'm just wondering. ;)
– sunrize531, Commented Jul 21, 2014 at 10:05

Aaron Digulla · Accepted Answer · 2014-07-21 09:21:25Z

If you run this process, say, once a day, then the overhead to create the thread and to destroy it is negligible.
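To put a rough number on that overhead, a small sketch like the following can time how long it takes to start and join a thread that does nothing. The function name `measure_thread_overhead` and the run count are made up for illustration, and the result will vary with the machine and with how busy the application is.

```python
import threading
import time


def measure_thread_overhead(runs=200):
    """Return the average seconds needed to start and join a no-op thread."""
    start = time.time()
    for _ in range(runs):
        t = threading.Thread(target=lambda: None)
        t.start()
        t.join()
    return (time.time() - start) / runs


if __name__ == "__main__":
    print("avg start+join cost: %.6f s" % measure_thread_overhead())
```

Comparing that figure with how often the job runs shows whether the creation cost matters at all.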

A thread that is waiting for a signal (like a message in a queue) doesn't need the CPU, so it doesn't cost you to keep it hanging around.

That means you can look at the other design factors: error handling, stability, code complexity.

If you have the error handling nailed down, keeping the thread alive is probably better, since that handles a corner case for you: accidentally running two instances at the same time.

If the thread can stall or you have problems with deadlocks and things like that, then it's better to kill any existing worker thread and start a clean one.
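A sketch of the spawn-on-demand variant follows. Python's `threading` module has no supported way to kill a thread from outside, so this sketch abandons a worker that looks stalled instead of killing it, and starts a fresh one. `run_batch`, `spawn_worker`, and the `max_age` threshold are assumptions made for illustration, not part of the original application.

```python
import threading
import time

_state_lock = threading.Lock()
_current = None      # most recently started worker thread, if any
_started_at = 0.0    # time.time() when that worker was started


def run_batch(batch, sink):
    """Hypothetical job: compute quickly, then 'store' the result."""
    sink.append(sum(batch))


def spawn_worker(batch, sink, max_age=60.0):
    """Start a fresh thread for this batch.

    Returns the new thread, or None if the previous worker is still
    running and is younger than max_age seconds.
    """
    global _current, _started_at
    with _state_lock:
        running = _current is not None and _current.is_alive()
        if running and time.time() - _started_at < max_age:
            return None  # previous batch is still in progress
        # Either nothing is running or the old worker looks stalled.
        # It cannot be killed, so it is abandoned (it is a daemon thread).
        _current = threading.Thread(target=run_batch, args=(batch, sink))
        _current.daemon = True
        _started_at = time.time()
        _current.start()
        return _current


if __name__ == "__main__":
    stored = []
    t = spawn_worker([1, 2, 3], stored)
    t.join()
    print(stored)
```

The lock around the check also covers the two-instances corner case, at the price of a little more bookkeeping than the long-lived worker needs.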



3 Comments

sunrize531 Over a year ago

I'm aware that the GIL constantly switches between threads (say, after every 100 bytecodes processed), even if those threads are sleeping. This can decrease overall performance, because the application already has a large number of threads.

Aaron Digulla Over a year ago

@sunrize531: Do you have references?

Aaron Digulla Over a year ago

Also, have you considered running the task in a new (forked) process?
