TLDR: Script A creates a directory and writes files in it. Script B periodically checks that directory. How does script B know when script A is done writing so that it can access the files?
I have a Python script (call it the render server) that receives requests to generate images and associated data. I need to run a separate Python application (call it the consumer) that makes use of this data. The consumer does not know when new data will be available. Ideally it should not have to know that the render server exists, just that data somehow becomes available.
My quick and dirty solution is to have an outputs directory known to both Python scripts. In that directory, the render server creates timestamped directories and saves several files within those directories.
The render server does something like:
    import os

    os.makedirs('outputs/' + timestamped_subdir)
    # Write files into that directory.
The consumer checks that directory kind of like:
    from glob import glob

    dirs = set()
    while True:
        new_dirs = set(glob('outputs/*')).difference(dirs)
        if not new_dirs:
            continue
        dirs.update(new_dirs)  # Remember what has already been seen.
        # Do stuff with the contents of the latest new directory.
The problem is that the consumer checks the contents of the directory before the render server finishes writing (and this is evident in a FileNotFoundError). I tried to fix this by making the render server do:
    os.makedirs('temp')
    # Write files into that directory.
    shutil.copytree('temp', 'outputs/' + timestamped_subdir)
But the consumer is still able to know of the presence of the timestamped_subdir before the files within are done being copied (again there's a FileNotFoundError). What's one "right" way to do what I'm trying to achieve?
Note: While writing this I realised I should do shutil.move instead of shutil.copytree and that seems to have fixed it. But I'm still not sure enough of the underlying mechanisms of that operation to know for sure that it works correctly.
Have the render server create its directory as 'outputs/' + timestamped_subdir + '_temp'. When the "render server" is finished with that directory, change it to do an os.rename('outputs/' + timestamped_subdir + '_temp', 'outputs/' + timestamped_subdir). That rename will be atomic as long as everything resides on the same filesystem. Now your other process just has to ignore the directories ending in _temp, and when it sees another folder, it'll know those are finished and complete. If you can't change the "render server", it's a whole different issue.
... shutil.move, which I believe is the same as os.rename. And if the answer is "yes it is the same", cool. Just want to know that others believe this is a solid solution.
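A minimal sketch of the rename approach described above, assuming both paths live on the same filesystem. The helper names `publish` and `finished_dirs` and the `files` mapping are hypothetical, made up for illustration, and are not part of the original scripts.

```python
import os


def publish(outputs_dir, name, files):
    """Write files into '<name>_temp', then rename it to '<name>'."""
    temp_path = os.path.join(outputs_dir, name + '_temp')
    final_path = os.path.join(outputs_dir, name)
    os.makedirs(temp_path)
    for filename, data in files.items():
        with open(os.path.join(temp_path, filename), 'wb') as f:
            f.write(data)
    # The directory only appears under its final name once it is complete.
    os.rename(temp_path, final_path)
    return final_path


def finished_dirs(outputs_dir, seen):
    """Return completed directories not seen before, skipping '*_temp'."""
    current = {
        entry.path
        for entry in os.scandir(outputs_dir)
        if entry.is_dir() and not entry.name.endswith('_temp')
    }
    new = sorted(current - seen)
    seen.update(new)  # Remember them so each directory is reported once.
    return new
```

The consumer would call `finished_dirs('outputs', seen)` in its polling loop, ideally with a short `time.sleep` between polls. On the `shutil.move` question: according to the Python documentation, `shutil.move` uses `os.rename` when the destination is on the same filesystem, and otherwise falls back to copying and then removing the source, which is not atomic. So it behaves like the rename above only when 'temp' and 'outputs' share a filesystem.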