I am currently doing some data analytics work and I'm having a bit of trouble with the data preprocessing.

I have compiled a folder of text files, with the name of the text file being the date that the text file corresponds to. I was originally able to append all of the text files to one document, but I wanted to use a dictionary in order to have 2 attributes, the filename (also the date) and the content in the text file.

This is the code:

import json
import os
import math

# Define output filename
OutputFilename = 'finalv2.txt'

# Define path to input and output files
InputPath  = 'C:/Users/Mike/Desktop/MonthlyOil/TextFiles'
OutputPath = 'C:/Users/Mike/Desktop/MonthlyOil/'

# Convert forward/backward slashes
InputPath  = os.path.normpath(InputPath)
OutputPath = os.path.normpath(OutputPath)

# Define output file and open for writing
filename = os.path.join(OutputPath,OutputFilename)
file_out = open(filename, 'w')
print ("Output file opened")

size = math.inf

def append_record(record):
    with open('finalv2.txt', 'a') as f:
        json.dump(record, f)
        f.write(json.dumps(record))

# Loop through each file in input directory
    for file in os.listdir(InputPath):
    # Define full filename
    filename = os.path.join(InputPath,file)
    if os.path.isfile(filename):
        print ("  Adding :" + file)
        file_in = open(filename, 'r')
        content = file_in.read()
        dict = {'filename':filename,'content':content}
        print ("dict['filename']: ", dict['filename'] )     
        append_record(dict)    
        file_in.close()


# Close output file
file_out.close()
print ("Output file closed")

The problem I am experiencing is that it won't append to my file. I have a line in there which tests whether or not the dict contains anything, and it does; I have tested both content and filename.

Any ideas what I'm missing to get the dict appended to the file?

2 Comments

  • Is your indentation correct? Should the for loop block be indented inside append_record()? Commented Aug 1, 2016 at 23:17
  • As entered, the for file block is indented one extra stop, but I'm assuming that is just a copy-paste formatting error? Commented Aug 1, 2016 at 23:40

1 Answer

There are many issues, but the one causing the trouble here is that you're opening finalv2.txt twice: once with mode w (and never writing to that handle), and again inside append_record(), this time with mode a and a bare relative name rather than the path built from OutputPath.
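A minimal sketch of how the two opens can end up pointing at different files. The bare name 'finalv2.txt' inside append_record() is relative, so it resolves against the current working directory. This assumes the script was launched from somewhere other than OutputPath, which the question does not state. The two temporary directories and the sample record are made-up stand-ins:

```python
import os
import tempfile

# Hypothetical stand-ins for OutputPath and for the directory the script is run from.
output_dir = tempfile.mkdtemp()
working_dir = tempfile.mkdtemp()

previous_cwd = os.getcwd()
os.chdir(working_dir)
try:
    # What the question does at module level: open (and truncate) the file in OutputPath.
    file_out = open(os.path.join(output_dir, 'finalv2.txt'), 'w')

    # What append_record() does: open a bare relative name, resolved against the cwd.
    with open('finalv2.txt', 'a') as f:
        f.write('{"filename": "2016-08-01.txt"}')

    # Nothing was ever written through this handle.
    file_out.close()
finally:
    os.chdir(previous_cwd)

size_in_output_dir = os.path.getsize(os.path.join(output_dir, 'finalv2.txt'))
size_in_working_dir = os.path.getsize(os.path.join(working_dir, 'finalv2.txt'))
print(size_in_output_dir, size_in_working_dir)
```

Under that assumption, the file in the stand-in for OutputPath stays empty while the record lands in the working directory, which would match the symptom described in the question.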

Consider the following:

import json
import os
import math

# Define output filename
OutputFilename = 'finalv2.txt'

# Define path to input and output files
InputPath  = 'C:/Users/Mike/Desktop/MonthlyOil/TextFiles'
OutputPath = 'C:/Users/Mike/Desktop/MonthlyOil/'

# Convert forward/backward slashes
InputPath  = os.path.normpath(InputPath)
OutputPath = os.path.normpath(OutputPath)

# Define output file
out_file = os.path.join(OutputPath,OutputFilename)

size = None

def append_record(fn, record):
    with open(fn, 'a') as f:
        json.dump(record, f)
        #f.write(json.dumps(record))

# Loop through each file in input directory
for fn in os.listdir(InputPath):
    # Define full filename
    in_file = os.path.join(InputPath,fn)
    if os.path.isfile(in_file):
        print("  Adding: " + fn)
        with open(in_file, 'r') as file_in:
            content = file_in.read()
            d = {'filename':in_file, 'content':content}
            print("d['filename']: ", d['filename'] )
            append_record(out_file, d)

Which works as you expected.

Here:

  • Files aren't explicitly opened and closed; they're managed by context managers (with)
  • There are no longer variables named dict and file (dict shadows the built-in type)
  • You define finalv2.txt in one place, and one place only
  • filename is no longer defined twice (once as the output file and then again as the input file); instead there are out_file and in_file
  • You pass the output filename to your append_record function
  • You append the JSON only once instead of twice (json.dump(record, f) and f.write(json.dumps(record)) both work, so pick whichever you prefer)
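
One thing to be aware of with either method: json.dump writes each object with no trailing separator, so repeated appends leave the objects back to back on a single line, which json.load cannot read back in one call. A minimal sketch of one way around that, assuming a one-object-per-line layout is acceptable for finalv2.txt. read_records, the temporary output path, and the sample records are made up for illustration:

```python
import json
import os
import tempfile


def append_record(fn, record):
    # Same shape as append_record above, plus a newline so each record
    # occupies exactly one line of the output file.
    with open(fn, 'a') as f:
        f.write(json.dumps(record) + '\n')


def read_records(fn):
    # Hypothetical helper: parse one JSON object per non-empty line.
    with open(fn, 'r') as f:
        return [json.loads(line) for line in f if line.strip()]


# Throwaway output file so the sketch does not touch real data.
out_file = os.path.join(tempfile.mkdtemp(), 'finalv2.txt')

# Made-up stand-ins for two of the dated text files.
append_record(out_file, {'filename': '2016-07-01.txt', 'content': 'first file\nsecond line'})
append_record(out_file, {'filename': '2016-08-01.txt', 'content': 'second file'})

records = read_records(out_file)
print([r['filename'] for r in records])
```

Newlines inside a file's content are escaped by json.dumps, so each record still fits on a single line and can be loaded back as a dictionary with the filename and content keys.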