I have the following code which is intended to remove specific lines of a file. When I run it, it prints the two filenames that live in the directory, then deletes all information in them. What am I doing wrong? I'm using Python 3.2 under Windows.

import os

files = [file for file in os.listdir() if file.split(".")[-1] == "txt"]

for file in files:
    print(file)
    input = open(file,"r")
    output = open(file,"w")

    for line in input:
        print(line)
        # if line is good, write it to output

    input.close()
    output.close()
7
  • 3
    Note: You should use os.path.splitext to get the file extension. Also you should read the file and then write to it after. Commented Jul 26, 2012 at 15:13
  • 1
    Do you want to write to the same file you open for reading? Commented Jul 26, 2012 at 15:14
  • 2
    @jamylak: No, the right solution would be to iterate over glob.iglob("*.txt"). Commented Jul 26, 2012 at 15:14
  • 1
    @SvenMarnach Ok that's better but I just meant to check file extensions. Commented Jul 26, 2012 at 15:15
  • 4
    @poke, that's the comment in the inside for loop. Before running the code I would put something there. Commented Jul 26, 2012 at 15:18

5 Answers

7

open(file, 'w') wipes the file. To prevent that, open it in r+ mode (read+write/don't wipe), then read it all at once, filter the lines, and write them back out again. Something like

with open(file, "r+") as f:
    lines = f.readlines()              # read entire file into memory
    f.seek(0)                          # go back to the beginning of the file
    f.writelines(filter(good, lines))  # dump the filtered lines back
    f.truncate()                       # wipe the remains of the old file

I've assumed that good is a function telling whether a line should be kept.
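
For a complete version you can run, here is a minimal sketch of the same idea. The good predicate (drop lines starting with "t") and the helper name filter_in_place are made up for illustration; substitute your own test.

```python
def good(line):
    # Hypothetical predicate: keep every line that does not start with "t".
    return not line.startswith("t")


def filter_in_place(path, keep=good):
    """Rewrite the file at `path`, keeping only lines for which keep(line) is true."""
    with open(path, "r+") as f:
        lines = f.readlines()   # read everything before writing anything
        f.seek(0)               # rewind to the start of the file
        f.writelines(line for line in lines if keep(line))
        f.truncate()            # cut off what is left of the old, longer content
```

The truncate() call matters because the filtered output is usually shorter than the original: without it, the tail of the old content would remain after the newly written lines.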

3

If your file fits in memory, the easiest solution is to open the file for reading, read its contents to memory, close the file, open it for writing and write the filtered output back:

with open(file_name) as f:
    lines = list(f)
# filter lines
with open(file_name, "w") as f:      # This removes the file contents
    f.writelines(lines)

Since you are not intermingling read and write operations, the advanced file modes like "r+" are unnecessary here, and only complicate things.

If the file does not fit into memory, the usual approach is to write the output to a new, temporary file, and move it back to the original file name after processing is finished.
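
The temporary-file approach could look roughly like the sketch below. The function name filter_large_file and the keep parameter are my own choices for illustration, and os.replace requires Python 3.3 or later (on older versions you would have to remove the original and then rename, which is not atomic).

```python
import os
import tempfile


def filter_large_file(path, keep):
    """Stream `path` line by line, keeping lines for which keep(line) is true."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the original so the final move
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as out, open(path) as src:
            for line in src:
                if keep(line):
                    out.write(line)
        # Both files are closed here, which Windows needs for the replace.
        os.replace(tmp_path, path)  # Python 3.3+; overwrites the original
    except BaseException:
        os.remove(tmp_path)  # do not leave a half-written temp file behind
        raise
```

Only one line is held in memory at a time, and the original stays untouched until the filtered copy is complete. Note that the replacement file is created by mkstemp, so it does not inherit the permissions of the original.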

2 Comments

However, r+ has the nice property of failing early when the file cannot be opened for reading.
@larsmans: So it will save the fraction of a second in the case that the job cannot be done anyway. I don't think this is worth the trouble.
1

One way is to use the fileinput stdlib module. Then you don't have to worry about opening and closing files, file modes, etc.

import fileinput
from contextlib import closing
import os

fnames = [fname for fname in os.listdir() if os.path.splitext(fname)[1] == ".txt"]
with closing(fileinput.input(fnames, inplace=True)) as fin:
    for line in fin:
        if 'z' not in line:  # your condition here
            # stdout is redirected to the file; the line already ends with
            # a newline, so suppress the one print() would add
            print(line, end='')

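With inplace=True, fileinput moves the original file to a backup and redirects stdout to a new file under the original name, so whatever you print ends up in the file. By default the backup (extension '.bak') is deleted once the output file is closed; pass backup='.bak' (or another extension) if you want to keep it.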

jon@minerva:~$ cat testtext.txt
one
two
three
four
five
six
seven
eight
nine
ten

After running the above with a condition of not line.startswith('t'):

jon@minerva:~$ cat testtext.txt
one
four
five
six
seven
eight
nine

0

You're deleting everything when you open the file for writing: mode "w" truncates the file immediately, so there is nothing left to read. Use open(file,"r+") instead, and read all the lines into a variable before writing anything.

0

You should not open the same file for reading and writing at the same time.

"w" means create an empty file for writing. If the file already exists, its data will be deleted.

So you can use a different file name for writing.
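
As a rough sketch of that, assuming you are happy to leave the original untouched and pick the output name yourself (filter_to_new_file and keep are names I made up for the example):

```python
def filter_to_new_file(src_name, dst_name, keep):
    """Copy the lines for which keep(line) is true from src_name to dst_name."""
    # Two different names, so opening dst_name with "w" cannot wipe the input.
    with open(src_name) as src, open(dst_name, "w") as dst:
        for line in src:
            if keep(line):
                dst.write(line)
```

For example, filter_to_new_file("data.txt", "data.filtered.txt", lambda line: line.strip() != "") would copy only the non-blank lines; both file names here are placeholders.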
