
I have 169 CSV files with an identical structure (90 columns with the same headers) and an identical naming system.

Screenshot of file names

The files are named as such:

  • 2019-v-1
  • 2019-v-2
  • 2019-v-3
  • etc.

For each CSV, I would like to add a column, with the header 'Visits', and for the value in that column to be taken from the file name (the number at the end, after the second dash).

So, for example, the first CSV will have a new column called 'Visits', where every row is given the value '1' in that column.

If there is a Python solution, that would be amazing. I don't come from a coding background, and that's the only language I've got somewhat familiar with, but I can't seem to figure this one out myself.

Any help would be massively appreciated - thank you!

2 Answers

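import os
import pandas as pd

def csv_folder_input(path, folder):
    # Build the full path to the folder that holds the CSV files
    path = os.path.join(path, folder)
    for filename in os.listdir(path):
        if filename.endswith(".csv"):
            full_name = os.path.join(path, filename)
            df = pd.read_csv(full_name)
            # '2019-v-1.csv' -> ['2019', 'v', '1.csv'] -> '1' -> 1
            df['Visits'] = int(filename.split('-')[2].split('.')[0])
            # index=False stops pandas from writing an extra index column
            df.to_csv(full_name, index=False)

# Replace both strings with your own path and folder name
csv_folder_input('your path name', 'your folder name')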

Pass your path name followed by your folder name. I can see that your folder name is 2019-v. Enter the path that leads to that folder, and make sure it is in the correct format for macOS. I believe it should work fine.
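To make the two pieces concrete, here is a minimal sketch of how the path and folder are combined and how the visit number can be pulled out of a file name. The path `/Users/yourname/Documents` and the helper name `visit_number` are placeholders made up for illustration, not anything from the question.

```python
import os

def visit_number(filename):
    """Return the number after the last dash, e.g. '2019-v-12.csv' -> 12."""
    stem = os.path.splitext(filename)[0]   # drop the '.csv' extension
    return int(stem.rsplit("-", 1)[1])     # keep only what follows the last dash

# Hypothetical location: replace with wherever your folder actually lives
path = "/Users/yourname/Documents"
folder = "2019-v"
full_path = os.path.join(path, folder)     # the folder the function would scan
```

Both arguments are ordinary strings, so they need quotes around them when you call the function. Leaving the quotes out is one likely cause of a syntax error.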


3 Comments

Thank you so much for offering a solution! Not to ask a total newbie question, but how/where would I add my path name and folder name? I tried, but it came up with a syntax error.
This is a function, so you call it as csv_folder_input(your path name, your folder name), with both values written as quoted strings. I have edited the answer to make it easier for you.
Amazing. Thanks so much for taking the time to help me out so much with this :)

First, you need a list of the files:

from os import listdir
from os.path import isfile, join
import csv       # You'll need this for the next step

mypath = 'path/to/csv/directory'
# Keep only CSV files, so hidden files such as .DS_Store are skipped
allfiles = [f for f in listdir(mypath)
            if isfile(join(mypath, f)) and f.endswith('.csv')]

Then you want to open each file, add the column, and save it again.

from os import listdir
from os.path import isfile, join
import csv

mypath = 'path/to/csv/directory'
# Keep only CSV files, so hidden files such as .DS_Store are skipped
allfiles = [f for f in listdir(mypath)
            if isfile(join(mypath, f)) and f.endswith('.csv')]

for file in allfiles:

  # This will only work if your filenames are consistent.
  # We split the name into a list, ['2019', 'v', '1.csv']
  #   then take the third item (index 2), and remove the
  #   last 4 characters.

  number_at_end = file.split("-")[2][:-4]

  # We open the file and read every row into a list
  # while the file is still open...

  with open(join(mypath, file), newline='') as csvfile:
    rows = list(csv.reader(csvfile))

  # Add the column name to the first row, and
  # add the value to each row...

  for i, row in enumerate(rows):
    if i == 0:
      row.append('Visits')
    else:
      row.append(number_at_end)

  # and then write the file back, opening it in write mode.

  with open(join(mypath, file), 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerows(rows)
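As a variation on the same idea, the steps can be wrapped in a function that is safe to run more than once. This is only a sketch under a few assumptions: the function name `add_visits_column` is made up here, the files are assumed to follow the `2019-v-N.csv` pattern from the question, and a file whose header already contains 'Visits' is assumed to have been processed already.

```python
import csv
from pathlib import Path

def add_visits_column(directory):
    """Append a 'Visits' column to every CSV in `directory`, in place."""
    for csv_path in sorted(Path(directory).glob("*.csv")):
        # '2019-v-12' -> '12' (the text after the last dash)
        visits = csv_path.stem.rsplit("-", 1)[-1]

        # Read all rows into a list before the file is closed
        with open(csv_path, newline="") as f:
            rows = list(csv.reader(f))

        # Skip empty files and files that already have the column
        if not rows or "Visits" in rows[0]:
            continue

        rows[0].append("Visits")          # header row
        for row in rows[1:]:
            row.append(visits)            # same value on every data row

        # Reopen in write mode to replace the file's contents
        with open(csv_path, "w", newline="") as f:
            csv.writer(f).writerows(rows)
```

Calling `add_visits_column('path/to/csv/directory')` would update every CSV in that folder. Because the files are overwritten, it is worth trying it on a copy of the folder first.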

1 Comment

Thanks so much for the help! This looks really promising. I keep getting the following error: '_csv.reader' object is not subscriptable
