Currently I have a loop that tries to find an unused filename by adding suffixes to a filename string. Once it fails to find a file, it uses the name that failed to open a new file with that name. The problem is that this code is used in a website, and there could be multiple attempts to do the same thing at the same time, so a race condition exists.

How can I keep Python from overwriting an existing file if one is created between the time of the check and the time of the open in the other thread?

I can minimize the chance by randomizing the suffixes, but the chance is already minimized based on parts of the pathname. I want to eliminate that chance with a function that can be told, create this file ONLY if it doesn't exist.

I can use win32 functions to do this, but I want this to work cross platform because it will be hosted on linux in the end.

  • If I had to do something like that, I'd use a predefined file name and append the current time/date to it - that way, you will be guaranteed a unique file name regardless. Commented Aug 28, 2009 at 16:22
  • Date is currently in the filename, the problem is on a heavily loaded webserver, you could easily have 2 requests in the same second. Commented Aug 28, 2009 at 17:02
  • Use uuid.uuid1() to create files with globally unique names. Commented Aug 28, 2009 at 17:44
  • I wrote a small Python package seqfile to solve this problem by generating sequential filenames in a unicode-safe, thread-safe, and OS-safe manner. Commented May 8, 2015 at 9:13
  • Long ago ... but perhaps someone else is looking for solutions here. We had a related discussion over here. Perhaps check out my OS-independent locking-by-directory github.com/drandreaskrueger/lockbydir Commented Mar 6, 2016 at 22:14

4 Answers

Use os.open() with os.O_CREAT and os.O_EXCL to create the file. That will fail if the file already exists:

>>> import os
>>> fd = os.open("x", os.O_WRONLY | os.O_CREAT | os.O_EXCL)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OSError: [Errno 17] File exists: 'x'

Once you've created a new file, use os.fdopen() to turn the handle into a standard Python file object:

>>> fd = os.open("y", os.O_WRONLY | os.O_CREAT | os.O_EXCL)
>>> f = os.fdopen(fd, "w")  # f is now a standard Python file object
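
The two calls above could be wrapped in a small helper. This is only a minimal sketch, assuming Python 3.3 or later, where the "File exists" failure surfaces as FileExistsError (a subclass of OSError). The name create_exclusive and the default permission bits are hypothetical choices for illustration, not part of the os module.

```python
import os

def create_exclusive(path, mode=0o644):
    """Create `path` for writing, or return None if it already exists.

    Hypothetical helper; only os.open and os.fdopen come from the standard library.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    try:
        # O_CREAT | O_EXCL asks the OS to check and create in a single step,
        # so no other process can slip in between the check and the open.
        fd = os.open(path, flags, mode)
    except FileExistsError:
        return None
    # Wrap the raw descriptor in a regular file object opened for writing.
    return os.fdopen(fd, "w")
```

A caller would treat a None result as "name taken" and try the next suffix.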

Edit: From Python 3.3, the builtin open() has an x mode that means "open for exclusive creation, failing if the file already exists".
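
With the x mode, the suffix loop from the question can let the open itself be the existence check. The following is a rough sketch under that assumption; the function name open_unused and the "-1", "-2" suffix scheme are made up for illustration.

```python
import itertools

def open_unused(stem, ext=".txt"):
    """Return (file_object, name) for the first candidate name that was free.

    Hypothetical helper: tries stem.txt, then stem-1.txt, stem-2.txt, ...
    """
    for i in itertools.count():
        name = f"{stem}{ext}" if i == 0 else f"{stem}-{i}{ext}"
        try:
            # "x" fails with FileExistsError instead of truncating an existing file.
            return open(name, "x"), name
        except FileExistsError:
            continue  # someone else owns this name; try the next suffix
```

Because the file is created by the same call that detects the collision, there is no window between the check and the open.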

1 Comment

On Linux with Python 2.6, os.fdopen(fd) raises an OSError with errno 22 if the mode argument is left at its default ('r'), since that does not match a descriptor opened with O_WRONLY. f = os.fdopen(fd, "w") works, however.

If you are concerned about a race condition, you can create a temporary file and then rename it.

>>> import os
>>> import tempfile
>>> f = tempfile.NamedTemporaryFile(delete=False)
>>> f.name
'c:\\users\\hughdb~1\\appdata\\local\\temp\\tmpsmdl53'
>>> f.write(b"Hello world")  # the default mode is binary, so write bytes
11
>>> f.close()
>>> os.rename(f.name, r'C:\foo.txt')
>>> if os.path.exists(r'C:\foo.txt'):
...     print('File exists')
...
File exists
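
One caveat with the rename step: on POSIX systems os.rename silently replaces an existing destination, while on Windows it raises FileExistsError. A possible variation, sketched here as an assumption rather than as the method this answer describes, swaps the rename for os.link, which refuses to overwrite. It presumes the temporary file and the target sit on the same filesystem and that the filesystem supports hard links. The name publish_no_overwrite is hypothetical.

```python
import os
import tempfile

def publish_no_overwrite(data, target):
    """Write `data` (bytes) to a temp file, then expose it as `target`
    only if `target` does not exist yet. Hypothetical helper."""
    directory = os.path.dirname(os.path.abspath(target))
    # Create the temp file next to the target so both share one filesystem.
    with tempfile.NamedTemporaryFile(dir=directory, delete=False) as tmp:
        tmp.write(data)
    try:
        # os.link raises FileExistsError if `target` is already present.
        os.link(tmp.name, target)
    finally:
        # Drop the temporary name whether or not the link succeeded.
        os.unlink(tmp.name)
```

The caller would catch FileExistsError and retry with another name, and the file appears under its final name already fully written.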

Alternatively, you can create the files using a UUID in the name (there is a Stack Overflow item on this).

>>> import uuid
>>> str(uuid.uuid1())
'64362370-93ef-11de-bf06-0023ae0b04b8'
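
A UUID-based name can be combined with exclusive creation, so that even an unlikely collision fails loudly instead of overwriting. This is a sketch only: it assumes Python 3.3+ for the "x" mode, uses the random uuid.uuid4() in place of the uuid1() shown above, and the helper name create_uuid_named is invented for illustration.

```python
import os
import uuid

def create_uuid_named(directory=".", ext=".dat"):
    """Create a new file with a random UUID name; return (file_object, name).

    Hypothetical helper; uuid4 is a swapped-in choice, not the uuid1 from the answer.
    """
    name = os.path.join(directory, uuid.uuid4().hex + ext)
    # "x" raises FileExistsError rather than overwriting, should a name ever repeat.
    return open(name, "x"), name
```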

1 Comment

I am checking to see if it exists; I'm worried about a race condition, as stated above. TemporaryFile doesn't have delete as a parameter, but NamedTemporaryFile does (in v2.6). Thanks for the pointer to this part of the Python library, which I did not know existed. The UUID approach would probably work but seems a bit exotic for what I really need.

If you have an id associated with each thread / process that tries to create the file, you could put that id in the suffix somewhere, thereby guaranteeing that no two processes can use the same file name.

This eliminates the race condition between the processes.
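
As a rough illustration of the idea, the process id and thread id can be folded into the name. This sketch assumes all writers run on one host, and it does not stop the same thread from producing the same name twice, so a counter or exclusive open would still be needed for that. The name per_worker_name is hypothetical.

```python
import os
import threading

def per_worker_name(stem, ext=".txt"):
    """Build a name that embeds the current process id and thread id.

    Hypothetical helper; ids are only unique among workers alive on one machine.
    """
    return f"{stem}-{os.getpid()}-{threading.get_ident()}{ext}"
```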

1 Comment

This may be a valid assumption for local (non-networked) filesystems on UNIX-like systems. Of course, there are other concerns if the open() might be executed on older versions of NFS, older Linux or other OS kernels, etc.: stackoverflow.com/questions/3406712/…

You might as well use something like this for file name checking. You provide the name of the file and, optionally, the extension of the file that you want to create. If a file with the same name is already present in the current working directory, it returns the name with an index appended, such as name(1); otherwise it returns the name unchanged.

import os

def nameIndexGenerator(name, fileExtension=''):
    # Build the extension suffix once so both cases share the same logic.
    suffix = f'.{fileExtension}' if fileExtension else ''
    candidate = f'{name}{suffix}'
    i = 1
    # Try name, then name(1), name(2), ... until one does not exist.
    while os.path.exists(candidate):
        candidate = f'{name}({i}){suffix}'
        i += 1
    return candidate

1 Comment

This ignores race conditions entirely. You would be better off creating a file with a UUID-based name and then renaming it, using the failures to generate a new name. The amount of time between os.path.exists and returning the filename (and then creating the file) is too great on a heavily loaded system for this to be rock solid.
