1

I have a folder with the following contents in it:

  1. one folder named: 1_blocks
  2. list of 19 csv files by years: London_255_1999.csv, London_255_2000.csv, …, London_255_2017.csv
  3. one other csv file: London_xyz_combined_output_all_years.csv

The task is to rename only the 19 yearly csv files using a for loop, from London_255_1999.csv, …, London_255_2017.csv to London_245_1999.csv, …, London_245_2017.csv (i.e. replacing 255 with 245 in each file name).

Here is my code. I don't want other files and folders to get renamed. Only the 19 files mentioned above.

import os

path = r'A:\Engineering'

for f in os.listdir(path):
    if f.startswith("London_255") and not f.endswith('years.csv'): 
      f_name, f_ext = os.path.splitext(f)

      f_site, f_strings, f_year = f_name.split('_')

      f_strings='245'
      f_site=f_site
      f_year=f_year

      new_name = '{}_{}_{}{}'.format(f_site, f_strings, f_year, f_ext)

      os.rename(f,new_name)

Please suggest the easiest way to rename, if any. I am getting the following error:

f_site, f_strings, f_year = f_name.split('_')
ValueError: not enough values to unpack (expected 3, got 2)

2 Answers

2

I modified your code to use a regular expression that matches the expected file name format and ignores the final csv you mentioned. When a name matches, it uses the built-in os and shutil libraries to rename the csv file to the new format you specified.


import os
import re
import shutil

# Create a regex that matches file names
# within the specified directory
# that fit the format you wrote about
# above.
# Essentially, this pattern says: look for
# [one or more letters]_255_[4 digits].csv
file_format_pattern = re.compile(r'[A-Za-z]+_255_\d{4}\.csv')

path = r'A:\Engineering'

# Loop over the entries in the given directory.
for file in os.listdir(path):
    # fullmatch() requires the whole name to fit the pattern
    correct_file = file_format_pattern.fullmatch(file)
    # Skip files that aren't in the correct format
    if correct_file:
        f_name, f_ext = os.path.splitext(file)
        f_site, f_strings, f_year = f_name.split('_')
        # New filename with 245 instead of 255
        new_filename = f_site + "_245_" + f_year + f_ext

        # Get the full, absolute file paths.
        absWorkingDir = os.path.abspath(path)
        original_file = os.path.join(absWorkingDir, file)
        renamed_file = os.path.join(absWorkingDir, new_filename)

        # With shutil, rename the original file to the new filename
        shutil.move(original_file, renamed_file)

You can learn a lot of cool ways to organize, rewrite, read, and modify files automatically from the automation book "Automate the Boring Stuff" by Al Sweigart.

It's free online here and it teaches you how to utilize Python for automation in an easy to understand manner.

I basically followed the chapter here to help with your question.

For a better understanding of regex patterns, check out chapter 7 of the book. I ran this on my workstation and it correctly renamed the csv files with the 255 format and ignored the final csv.
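
Before renaming anything, it can help to check which names a pattern accepts. The sketch below does that for the names from the question. The pattern is an assumed, tightened variant (letters only for the site, a literal 255, matched against the whole name), and `matches` is a hypothetical helper name used only for illustration.

```python
import re

# Assumed pattern: [letters]_255_[4 digits].csv, applied to the whole name.
pattern = re.compile(r'[A-Za-z]+_255_\d{4}\.csv')


def matches(name):
    """Return True if the entire file name fits the pattern (hypothetical helper)."""
    return pattern.fullmatch(name) is not None


names = [
    'London_255_1999.csv',
    'London_255_2017.csv',
    'London_xyz_combined_output_all_years.csv',
    '1_blocks',
]

# Print each name together with whether it would be selected for renaming.
for name in names:
    print(name, matches(name))
```

Using fullmatch() rather than search() means a name with extra text before or after the expected format is rejected instead of being partially matched.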

Please let me know if you have any questions and hope this helps. Cheers!

4 Comments

Thanks a lot! I was wondering whether using regex was a better idea. I will take another look at the chapters you mentioned. One more question: do we need "absWorkingDir = os.path.abspath(path)"? What's the use of os.path.abspath here? Thanks again!
@Aakash Not necessarily. Absolute paths begin at the root directory. Relative paths (think "../../folder" or "./folder") are resolved relative to the folder you are running your program in. Essentially, using absolute paths makes your program more robust. For a better explanation, check out the "Absolute vs. Relative Paths" section of chapter 8 of the Automate the Boring Stuff book (automatetheboringstuff.com/chapter8).
imo using regex is helpful if you have to deal with many different file names that share a pattern, e.g. London_255_2345.csv, Egypt_123_9876.csv and so on. But if you only have to handle string matches like London_255_1234.csv, London_255_1235.csv and so on, it is much cleaner to use str.startswith() and str.endswith(). That is a no-brainer, though not a general solution.
@colidyre Good point! Regex may have been a bit of overkill for this problem. As you mentioned, str.startswith() and str.endswith() are a lot more readable when you return to read or modify the codebase. Unless it is commented well, it can take a while to understand what a regex is doing. There are multiple ways to do something. It's important to develop an intuition for when to use certain tools and optimize.
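
A minimal sketch of the str.startswith() / str.endswith() approach discussed in these comments, assuming the yearly files always begin with "London_255_" and end with ".csv". The function name `rename_255_to_245` is hypothetical. Both paths are joined with the directory so the rename does not depend on the current working directory.

```python
import os


def rename_255_to_245(directory):
    """Rename London_255_<year>.csv entries in `directory` to London_245_<year>.csv.

    Returns the list of new names (hypothetical helper for illustration).
    """
    renamed = []
    for name in sorted(os.listdir(directory)):
        # Plain string checks: only the yearly London_255 csv files qualify.
        if name.startswith('London_255_') and name.endswith('.csv'):
            # Replace only the first '_255_' so the rest of the name is untouched.
            new_name = name.replace('_255_', '_245_', 1)
            # Join with the directory so the call works from any working directory.
            os.rename(os.path.join(directory, name),
                      os.path.join(directory, new_name))
            renamed.append(new_name)
    return renamed
```

It could be called as rename_255_to_245(r'A:\Engineering'). The trailing underscore in the prefix keeps London_xyz_combined_output_all_years.csv and the 1_blocks folder out of the selection.
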
2

The problem with using the str.split('_') method and unpacking the result into exactly 3 variables is that you have to guarantee that there are exactly two underscores in each string you want to split.

The error message ValueError: not enough values to unpack (expected 3, got 2) indicates that you have a string with only one underscore in your directory.

See:

a, b, c = "foo_bar".split("_")
ValueError: not enough values to unpack (expected 3, got 2)

So the unpacking should work if only the files you have listed are in the given folder. But that does not seem to be the case.

It seems that your folder contains at least one file or folder whose name starts with London_255, does not end with years.csv, and has only one underscore.

So you can either check that the string contains exactly 2 underscores before splitting and unpacking it, or look into the directory and check its contents manually.
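
A minimal sketch of the first option, checking the number of parts before unpacking. `split_name` is a hypothetical helper, and the sample stems are made up to show one well-formed name and two that would otherwise break the unpacking.

```python
def split_name(stem):
    """Split 'site_code_year' into a 3-tuple, or return None if the shape differs."""
    parts = stem.split('_')
    # Exactly two underscores give exactly three parts.
    if len(parts) != 3:
        return None
    return tuple(parts)


# One well-formed stem, one with too few parts, one with too many.
for stem in ['London_255_1999', 'London_255', 'London_255_1999_copy']:
    print(stem, '->', split_name(stem))
```

Entries for which the helper returns None can simply be skipped in the rename loop, which avoids the ValueError without having to clean up the folder first.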
