2

I am trying to upgrade a Python script that runs an executable on Windows and manages its text output files, so that it uses multiple threads and I can utilize more than one core. I have four separate versions of the executable, and each thread knows which one to access. This part works fine. The problems start when the threads run simultaneously and try to open their (different) output files to check that the executable ran correctly and to react depending on the file's contents.

Specifically, when running three threads, two will crash with the following error, while one continues to work:

Exception in thread Thread-4:
Traceback (most recent call last):
  File "C:\Python27\lib\threading.py", line 552, in __bootstrap_inner
    self.run()
  File "E:\HDA\HDA-1.0.1\Hm-1.0.1.py", line 782, in run
    conf = self.conf_file(Run)
  File "E:\HDA\HDA-1.0.1\Hm-1.0.1.py", line 729, in conf_file
    l = open(self.run_dir(Run)+Run, 'r').readlines()     #list of file lines
IOError: [Errno 2] No such file or directory: 'Path/to/Outputfile'

This happens because the thread did not run the executable correctly, so 'Path/to/Outputfile' was never created and cannot be found. But one of the threads does this correctly while the other two do not. Is there a reason why I can't get multiple threads to run different versions of an executable?

2
  • 3
    use threads I'll problems have two, "I know", I think Commented Apr 24, 2012 at 21:31
  • 1
    A typical issue would be a call to os.chdir() somewhere, as the current working directory is a process wide property. Commented Apr 25, 2012 at 6:06

2 Answers

2

I don't think the GIL by itself would kill this, unless opening a file gets you into some weird deadlock or spinlock condition. In general, you want threads in cases like this where you're I/O-bound. In fact, the threads being able to run concurrently probably contributes to the other threads failing rather than successfully opening a file several times.

On slide fifteen of this presentation, the author points out that the GIL releases on blocking I/O calls to give other threads a chance.
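
A minimal sketch of that behaviour, assuming time.sleep() as a stand-in for a blocking call (blocking_call and timed_run are names made up for this illustration, not part of the question's script). If the GIL were held during the wait, four threads would take roughly four times as long as one:

```python
import threading
import time

def blocking_call(seconds):
    # time.sleep() waits outside the interpreter, so CPython releases
    # the GIL and lets the other threads proceed in the meantime.
    time.sleep(seconds)

def timed_run(n_threads=4, seconds=0.5):
    # Start n_threads blocking calls at once and measure the wall-clock
    # time until all of them have finished.
    threads = [threading.Thread(target=blocking_call, args=(seconds,))
               for _ in range(n_threads)]
    start = time.time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.time() - start

if __name__ == '__main__':
    print(timed_run())
```

The elapsed time should come out close to the length of a single wait rather than the sum of all four, which is why I doubt the GIL is what breaks your threads.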

The real problem here seems to be a lock on a file resource. I'm not really sure about how Windows works, so I can't speak to why this error is creeping up, but it seems like only one thread actually has a lock on a file resource.

The other poster's point about multiple cores and the GIL might be coming into play, in that you could have some sort of priority inversion going on where the other two threads are getting starved, but I find it unlikely given that the above presentation says that threads in the middle of a blocking operation free the lock for other threads.

One thought is to try multiprocessing. I suspect you'll have better luck with reading the file across multiple processes rather than with threads.

Here is an example I wrote and tried on my OS X 10.7.3 machine. It opens a file named test whose contents are lol\n:

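import multiprocessing

def open_file(x):
    # Read the whole file and return its lines as a list.
    with open(x, 'r') as file_obj:
        return file_obj.readlines()

if __name__ == '__main__':
    # The guard is needed on Windows, where child processes re-import this module.
    a = multiprocessing.Pool(4)
    print(a.map(open_file, ['test'] * 4))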

Here's the result when I execute it:

➜  ~ git:(master) ✗ python open_test.py
[['lol\n'], ['lol\n'], ['lol\n'], ['lol\n']]
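
If you stay with threads, the os.chdir() point raised in the comments on the question is worth ruling out first. Here is a rough sketch, assuming each run has its own directory and the executable writes its output file there. CHILD_CMD, run_one, run_all and output.txt are hypothetical stand-ins, since I don't know how your script launches the executable:

```python
import os
import subprocess
import sys
import tempfile
import threading

# Stand-in for the real executable: a child Python process that writes
# one line to output.txt in whatever its working directory is.
CHILD_CMD = [sys.executable, "-c", "open('output.txt', 'w').write('ok\\n')"]

def run_one(run_dir, results, index):
    # cwd= sets the working directory for this child process only.
    # Calling os.chdir() here would change it for every thread at once.
    returncode = subprocess.call(CHILD_CMD, cwd=run_dir)
    out_path = os.path.join(run_dir, "output.txt")  # absolute path
    if returncode != 0 or not os.path.exists(out_path):
        results[index] = None  # the run failed; leave it to the caller
        return
    with open(out_path, "r") as file_obj:
        results[index] = file_obj.readlines()

def run_all(n_threads=3):
    # One private directory per thread, so the runs cannot collide.
    run_dirs = [tempfile.mkdtemp() for _ in range(n_threads)]
    results = [None] * n_threads
    threads = [threading.Thread(target=run_one, args=(d, results, i))
               for i, d in enumerate(run_dirs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

if __name__ == "__main__":
    print(run_all())
```

Checking the return code before opening the file also tells you whether the executable itself failed, instead of only finding out from the IOError.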

2

CPython threads cannot currently run Python code on more than one core at a time, because of the Global Interpreter Lock. Multithreading tends to be fraught with trouble anyway, so it is better to use multiple processes if you can.
