I want to download some files and save them in a folder. Some of the file names may be duplicated, and I want to avoid overwriting files that are already there. I think I need an automatic naming scheme, but I don't know how to build one. My function uses shutil and urllib2.

This is part of my code:

path = 'C:/DL/Others/'+filename+file_ext
with open(path, 'wb') as fp:
    shutil.copyfileobj(req, fp)

As you know, we can check whether a file exists with os.path.exists('path'). I want to rename my files as I save them so that duplicate names are avoided, using a pattern such as appending a number to the file name. So if there were 4 files with the same name, "fname", I want 4 files following this pattern: fname - fname(1) - fname(2) - fname(3)

2 Answers

Something like this is probably reasonable:

import os

path = 'c:/DL/Others/%s%s' % (filename, file_ext)
uniq = 1
# Keep trying numbered names until one is not taken on disk.
while os.path.exists(path):
    path = 'c:/DL/Others/%s_%d%s' % (filename, uniq, file_ext)
    uniq += 1

If the original path doesn't exist, no _1 suffix is added; if it does exist, the counter increases until it finds a name that is free.
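For reference, the same loop can be wrapped in a small helper. This is only a sketch: the name `unique_path` and its parameters are made up for illustration, and it uses the `fname(1)` pattern from the question instead of `_1`.

```python
import os


def unique_path(directory, filename, file_ext):
    """Return a path inside `directory` that does not exist yet.

    Hypothetical helper; `file_ext` is assumed to include the leading dot.
    """
    path = os.path.join(directory, filename + file_ext)
    uniq = 1
    while os.path.exists(path):
        # Produces fname(1).ext, fname(2).ext, ... as asked in the question.
        path = os.path.join(directory, '%s(%d)%s' % (filename, uniq, file_ext))
        uniq += 1
    return path
```

You would then call `path = unique_path('C:/DL/Others', filename, file_ext)` before opening the file. Keep in mind there is a gap between the existence check and the `open` call, so two downloads running at the same time could still pick the same name.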

4 Comments

Where obviously you can swap out the _ for parens if you prefer.
Thanks! Simple and effective :)
This is terribly inefficient if you have lots of duplicates, though. If you have >100 duplicates, you have to do >5000 iterations of that loop in total.
@DanielDiPaolo - Only if you reset the uniq indicator for each file, which isn't necessary if you know they all share a common prefix. In any event, it's simple enough to memoize the created file names if the volume does grow terribly large.

Track each filename's count as you create it:

fname_counts = {}

# ... whatever generates filename and file_ext goes here...

# Count how often this name has been seen so far (0 for the first one).
if filename + file_ext in fname_counts:
    fname_counts[filename + file_ext] += 1
else:
    fname_counts[filename + file_ext] = 0


# now check if it's a dupe when you create the path
if fname_counts[filename + file_ext]:
    # file_ext already includes the leading dot, so the format adds no extra '.'
    path = 'C:/DL/Others/%s_%s%s' % (filename, fname_counts[filename + file_ext], file_ext)
else:
    path = 'C:/DL/Others/' + filename + file_ext

Example at work with two duplicates ("test.txt"):

>>> filenames_and_exts = [('test', '.txt'), ('test', '.txt'), ('test2', '.txt'), ('test', '.cfg'), ('different_name', '.txt')]
>>> fname_counts = {}
>>> for filename, file_ext in filenames_and_exts:
...     if filename + file_ext in fname_counts:
...         fname_counts[filename + file_ext] += 1
...     else:
...         fname_counts[filename + file_ext] = 0
...     if fname_counts[filename + file_ext]:
...         path = 'C:/DL/Others/%s_%s%s' % (filename, fname_counts[filename + file_ext], file_ext)
...     else:
...         path = 'C:/DL/Others/' + filename + file_ext
...     print(path)


C:/DL/Others/test.txt
C:/DL/Others/test_1.txt
C:/DL/Others/test2.txt
C:/DL/Others/test.cfg
C:/DL/Others/different_name.txt
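If you prefer the `fname(1)` style from the question, the same counting could be written roughly like this. It is a sketch only: `assign_paths` and its `directory` parameter are names invented for the example, and `collections.defaultdict` replaces the explicit if/else on the dictionary.

```python
from collections import defaultdict


def assign_paths(names, directory='C:/DL/Others/'):
    """Map (filename, file_ext) pairs to paths, numbering repeated names.

    Hypothetical helper; `file_ext` is assumed to include the leading dot.
    """
    counts = defaultdict(int)  # name -> how many times it was seen before
    paths = []
    for filename, file_ext in names:
        seen = counts[filename + file_ext]
        counts[filename + file_ext] += 1
        if seen:
            # Second and later occurrences get fname(1), fname(2), ...
            paths.append('%s%s(%d)%s' % (directory, filename, seen, file_ext))
        else:
            paths.append(directory + filename + file_ext)
    return paths
```

Like the code above, this only knows about names generated in the current run. It never looks at the disk, so files left from a previous run could still be overwritten unless you also check with `os.path.exists`.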
