I have five CSV files with the same fields in the same order. They need to be processed as follows:
- Get list of files
- Make each file into a dataframe
- Check whether a column of letter-number codes contains a specific value (different for each file), e.g. check whether the value PT333 is in column1 for the file named data1:
column1  column2  column3
PT389    LA       image.jpg
PT372    NY       image2.jpg
- If the column has that value, print the value and the filename/variable name that I've assigned to that file, and then rename that dataframe to output1
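As a rough illustration of the check in the third step, here is a minimal sketch using the sample rows above. The helper name `has_value` is hypothetical, and the frame is built inline only so the snippet runs on its own; in practice it would come from `pd.read_csv`.

```python
import pandas as pd

# Sample rows from the question, built inline so the snippet is self-contained.
data1 = pd.DataFrame(
    {
        "column1": ["PT389", "PT372"],
        "column2": ["LA", "NY"],
        "column3": ["image.jpg", "image2.jpg"],
    }
)


def has_value(df, value, column="column1"):
    """Return True if any row of `column` equals `value` exactly (hypothetical helper)."""
    return bool(df[column].eq(value).any())


found_333 = has_value(data1, "PT333")  # PT333 is not in the sample rows
found_389 = has_value(data1, "PT389")  # PT389 is in the first row
print(found_333, found_389)
```

With the two sample rows this prints `False True`, since PT333 does not appear in column1 but PT389 does.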
I tried to do this, but I don't know how to make it loop and do the same thing for each file.
At the moment it prints the matching value, but I also want it to report the dataframe name and to loop through all the files (a to e), checking each one for the values in the numbers list.
This is what I have:
import os
import glob
import pandas as pd
from os.path import expanduser
home = expanduser("~")
os.chdir(home + '/files/')
data = glob.glob('data*.csv')
data
# If you have tips on how to loop through these rather than
# have a line for each one, open to feedback
# Skip malformed rows (on_bad_lines requires pandas 1.3 or later)
a = pd.read_csv(data[0], encoding='ISO-8859-1', on_bad_lines='skip')
b = pd.read_csv(data[1], encoding='ISO-8859-1', on_bad_lines='skip')
c = pd.read_csv(data[2], encoding='ISO-8859-1', on_bad_lines='skip')
d = pd.read_csv(data[3], encoding='ISO-8859-1', on_bad_lines='skip')
e = pd.read_csv(data[4], encoding='ISO-8859-1', on_bad_lines='skip')
filenames = [a, b, c, d, e]
filelist = ['a', 'b', 'c', 'd', 'e']
# I am aware that this part is repetitive. Unsure how to fix this,
# I keep getting errors
# Any help appreciated
numbers = ['PT333', 'PT121', 'PT111', 'PT211', 'PT222']
# Named check_values rather than type, which would shadow the built-in
def check_values():
    for i in a.column1:
        if i == numbers[0]:
            print(numbers[0])
        elif i == numbers[1]:
            print(numbers[1])
        elif i == numbers[2]:
            print(numbers[2])
        elif i == numbers[3]:
            print(numbers[3])
        elif i == numbers[4]:
            print(numbers[4])
check_values()
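One possible way to avoid a separate read_csv line per file is to loop over the glob results and keep the frames in a dict keyed by file name. This is only a sketch: `load_frames` is a hypothetical helper, the temporary folder and its two files exist only to make the example runnable, and the encoding is carried over from the code above.

```python
import glob
import os
import tempfile

import pandas as pd


def load_frames(folder, pattern="data*.csv"):
    """Read every CSV matching `pattern` in `folder` into a dict keyed by
    the file name without its extension (hypothetical helper)."""
    frames = {}
    # sorted() gives a stable order; glob itself does not guarantee one.
    for path in sorted(glob.glob(os.path.join(folder, pattern))):
        name = os.path.splitext(os.path.basename(path))[0]
        # error_bad_lines=False is left out because it was removed in pandas 2.0;
        # on pandas 1.3 or later the equivalent is on_bad_lines='skip'.
        frames[name] = pd.read_csv(path, encoding="ISO-8859-1")
    return frames


# Demo: write two throwaway files to a temporary folder, then load them.
with tempfile.TemporaryDirectory() as folder:
    for i, code in enumerate(["PT333", "PT121"], start=1):
        sample = pd.DataFrame(
            {"column1": [code], "column2": ["LA"], "column3": ["image.jpg"]}
        )
        sample.to_csv(os.path.join(folder, f"data{i}.csv"), index=False)
    frames = load_frames(folder)

loaded_names = list(frames)
print(loaded_names)
```

Keeping the frames in a dict means the name travels with the data, so there is no need to maintain two parallel lists of frames and their names.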
Also happy to take any constructive criticism as to how to repeat less code and make things smoother. TIA
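For the repeated if/elif chain, one option might be `Series.isin`, which tests a whole column against the list in a single call. The sketch below assumes the frames are held in a dict keyed by name; the three small frames are made-up stand-ins for the real files, and storing matches under keys like `output1` in a second dict is one interpretation of "rename that dataframe", since a Python variable cannot easily be renamed at runtime.

```python
import pandas as pd

numbers = ["PT333", "PT121", "PT111", "PT211", "PT222"]

# Made-up stand-ins for the loaded files, keyed by the names used in the question.
frames = {
    "a": pd.DataFrame({"column1": ["PT389", "PT333"]}),
    "b": pd.DataFrame({"column1": ["PT372", "PT400"]}),
    "c": pd.DataFrame({"column1": ["PT121", "PT111"]}),
}

outputs = {}  # matching frames, stored under output1, output2, ...
report = {}   # frame name -> list of values from `numbers` found in it

for name, df in frames.items():
    # Boolean mask: True where column1 holds one of the listed values.
    mask = df["column1"].isin(numbers)
    hits = df.loc[mask, "column1"].drop_duplicates().tolist()
    if hits:
        print(f"{name}: found {hits}")
        report[name] = hits
        outputs[f"output{len(outputs) + 1}"] = df
```

With these stand-in frames, `a` and `c` are reported and stored as `output1` and `output2`, while `b` is skipped because none of its values are in the list.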