I am writing a Python program that runs an external model in parallel a certain number of times to define data in a parameter space. Because of how the external model is written (for good reasons, I promise), I have to make a separate copy of the model folder for every instance I want to run at the same time. I made four copies of my main model folder foo and called them foo0, foo1, foo2, and foo3. Now I want each thread to go into its own directory, make some changes, run the model, write to a main file, and then move on to the next run. Each model run can take between 30 s and 200 s, hence the benefit of parallel over serial runs.
import subprocess
import threading
from joblib import Parallel, delayed

def run_model(base_path):
    # Make some changes using a random number generator in the folder
    ....
    # Run the model using bash on Windows. Note that str(threading.get_ident()) is my
    # attempt to get the thread number 0, 1, 2, 3
    subprocess.run(['bash', '-c', base_path + str(threading.get_ident()) + '/Model.exe'])
    # Write some input for the run to a main file that will store all runs
    with open('Inputs.txt', 'a') as file:
        with open(base_path + str(threading.get_ident()) + '/inp.txt') as inp_file:
            for i, line in enumerate(inp_file):
                if i == 5:
                    file.write(line)

Parallel(n_jobs=4, backend="threading")(delayed(run_model)('Models/foo') for i in range(0, 10000))
However, I keep getting a FileNotFoundError: threading.get_ident() returns an arbitrary integer identifier rather than 0, 1, 2, or 3, and it keeps changing, so the folder it points to does not exist. The model is large, so creating a new copy for every thread id (something like a folder named foo+thread_id) is both slow and uses a lot of disk space. Is there any way to tie each existing copy of the model to one thread at a time, so that no other thread uses it while it is running?
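To make the requirement concrete, here is a minimal sketch of one possible approach, not necessarily the best one: instead of deriving the folder name from the thread id, each run borrows a free folder from a shared queue.Queue and puts it back when it finishes. The helper names make_folder_pool and run_in_free_folder, and the stand-in fake_work (which only sleeps instead of calling Model.exe), are hypothetical and exist only for illustration. The sketch uses the standard-library ThreadPoolExecutor so that it runs without joblib.

```python
import queue
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def make_folder_pool(base_path, n_copies):
    """Build a queue with one entry per pre-made model copy (foo0, foo1, ...)."""
    pool = queue.Queue()
    for k in range(n_copies):
        pool.put(f"{base_path}{k}")
    return pool


def run_in_free_folder(pool, work):
    """Borrow a folder, run work(folder), and always hand the folder back."""
    folder = pool.get()       # blocks until some copy is free
    try:
        return work(folder)
    finally:
        pool.put(folder)      # release the copy for the next run


# --- Demo with a stand-in for the real model run (hypothetical) ---
in_use = set()
in_use_lock = threading.Lock()
clashes = []                  # folders that were handed to two runs at once


def fake_work(folder):
    """Pretend to run the model; record any folder used by two runs at once."""
    with in_use_lock:
        if folder in in_use:
            clashes.append(folder)
        in_use.add(folder)
    time.sleep(0.01)          # placeholder for the 30 s to 200 s model run
    with in_use_lock:
        in_use.discard(folder)
    return folder


pool = make_folder_pool("Models/foo", 4)
with ThreadPoolExecutor(max_workers=4) as executor:
    used = list(executor.map(lambda _: run_in_free_folder(pool, fake_work), range(20)))

print(sorted(set(used)), "clashes:", clashes)
```

With joblib's threading backend the same idea should carry over by passing delayed(run_in_free_folder)(pool, work) to Parallel, since the queue is shared between threads. Appending to Inputs.txt from several threads would probably also need a threading.Lock around the write, as fake_work does for its bookkeeping.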