I am currently working on some Data Analytics work and I'm having a bit of trouble with the Data Preprocessing.
I have compiled a folder of text files, each named after the date it corresponds to. I was originally able to append all of the text files to one document, but I want to use a dictionary so that each record has two attributes: the filename (which is also the date) and the content of the text file.
This is the code:
import json
import os
import math

# Define output filename
OutputFilename = 'finalv2.txt'
# Define path to input and output files
InputPath = 'C:/Users/Mike/Desktop/MonthlyOil/TextFiles'
OutputPath = 'C:/Users/Mike/Desktop/MonthlyOil/'
# Convert forward/backward slashes
InputPath = os.path.normpath(InputPath)
OutputPath = os.path.normpath(OutputPath)
# Define output file and open for writing
filename = os.path.join(OutputPath,OutputFilename)
file_out = open(filename, 'w')
print ("Output file opened")
size = math.inf

def append_record(record):
    with open('finalv2.txt', 'a') as f:
        json.dump(record, f)
        f.write(json.dumps(record))

# Loop through each file in input directory
for file in os.listdir(InputPath):
    # Define full filename
    filename = os.path.join(InputPath,file)
    if os.path.isfile(filename):
        print (" Adding :" + file)
        file_in = open(filename, 'r')
        content = file_in.read()
        dict = {'filename':filename,'content':content}
        print ("dict['filename']: ", dict['filename'] )
        append_record(dict)
        file_in.close()

# Close output file
file_out.close()
print ("Output file closed")
The problem I am experiencing is that nothing gets appended to my file. I have a line in there that tests whether the dict contains anything, and it does; I have checked both content and filename.
Any ideas what I'm missing to get the dict appended to the file?
Should the for loop block be indented inside append_record()?
The for file block is indented an extra stop, but I'm assuming that is just a copy-paste formatting error?
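Going by the code as posted, one likely cause is that two different files are involved. file_out opens finalv2.txt under OutputPath, while append_record() opens the bare name 'finalv2.txt', which resolves against the current working directory. The records would then land in a second file, and the one under OutputPath stays empty. Each record is also written twice (once by json.dump and once by f.write) with no newline between records. Below is a minimal sketch of one way around both issues, not a definitive fix. The function name combine_text_files and its parameters are hypothetical, and storing the bare filename rather than the full path is an assumption based on the stated goal of keeping the date.

```python
import json
import os


def combine_text_files(input_dir, output_file):
    """Write one JSON object per line: {"filename": ..., "content": ...}.

    Hypothetical helper, not part of the original script.
    Returns the number of records written.
    """
    count = 0
    # Open the output once, by its full path, so every record
    # goes to the same file.
    with open(output_file, "w") as out:
        for name in sorted(os.listdir(input_dir)):
            path = os.path.join(input_dir, name)
            if not os.path.isfile(path):
                continue  # skip subdirectories
            with open(path, "r") as file_in:
                content = file_in.read()
            # Assumption: keep only the bare filename (the date),
            # not the full path.
            record = {"filename": name, "content": content}
            # Serialize each record exactly once, newline-terminated.
            out.write(json.dumps(record) + "\n")
            count += 1
    return count
```

With the paths from the question, the call would be combine_text_files(InputPath, os.path.join(OutputPath, OutputFilename)). Each line of the output can then be read back on its own with json.loads.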