I'm writing code that reads a text file (fileONE.txt) and writes to a third file (fileTHREE.txt) only the data that also appears in another text file (fileTWO.txt), as in the example below:

fileONE.txt
aaa bbb cccddd
ccc ddd aaa eee

fileTWO.txt
aaa
bbb
ccc

final result - fileTHREE.txt
aaa bbb ccc 
ccc aaa

Note that only the data that is in the fileTWO.txt is kept in the destination file (fileTHREE.txt).

The code is below and mostly works; however, some data is not removed from the final file. For example, 'ddd', which should be removed, remains.

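def filter_symbols():  # function name assumed; the snippet omitted the def line
    # Read the whole source file into one string.
    with open('fileONE.txt', 'r', encoding="utf-8") as f:
        id_codigo = f.read()

    # Surround every occurrence of each symbol with spaces, so that
    # symbols that are run together get separated.
    with open('fileTWO.txt') as f:
        for line in f:
            key1 = ''.join(line.split())
            id_codigo = id_codigo.replace(key1, " " + key1 + " ")
    id_codigo = id_codigo.split()

    # Delete the symbols that are not listed in fileTWO.txt.
    with open('fileTWO.txt') as f:
        file_elements = f.read().splitlines()
        for i in id_codigo:
            if i not in file_elements:
                id_codigo.remove(i)
    id_codigo1 = ' '.join(id_codigo)
    return id_codigo1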

Explanation: the contents of 'fileONE.txt' go into id_codigo. In the first with block, I read each symbol from 'fileTWO.txt' and surround every occurrence of it in id_codigo with spaces. (Some symbols are run together, so this first block also separates them.)

In the second with block, I delete from id_codigo the symbols that are not in 'fileTWO.txt'. Then I return everything as id_codigo1, to be written to another file (fileTHREE.txt).

It apparently does what it should, but some symbols that shouldn't remain, such as 'ddd', still get through. Can someone check whether something is missing? I've debugged it and can't find the error.

1 Answer

It is not a good idea to open the second file in a loop. You could do this instead:

# Read the symbols to keep, one per line; a set gives fast membership tests.
with open('fileTWO.txt', 'r', encoding='utf-8') as f:
    keep = set(f.read().splitlines())

# Filter each line of the source file, keeping only the listed symbols.
result = []
with open('fileONE.txt', 'r', encoding='utf-8') as f:
    for line in f.read().splitlines():
        result.append([word for word in line.split() if word in keep])

# Write the filtered lines to the destination file.
with open('fileTHREE.txt', 'w', encoding='utf-8') as f:
    for line in result:
        f.write(" ".join(line) + "\n")
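
As for why symbols such as 'ddd' can survive in the question's code: its second loop calls remove() on the same list it is iterating over. Each removal shifts the remaining items one position to the left, so the item right after a removed one is skipped and never checked. The sketch below illustrates this on an in-memory list; the function names and the sample tokens are made up for the illustration and are not from the question.

```python
def remove_while_iterating(tokens, keep):
    """Mimics the question's second loop: mutates the list being iterated."""
    tokens = list(tokens)  # work on a copy so the caller's list is untouched
    for t in tokens:
        if t not in keep:
            tokens.remove(t)  # shifts later items left; the next one is skipped
    return tokens


def filter_tokens(tokens, keep):
    """Builds a new list instead, so every item is checked exactly once."""
    return [t for t in tokens if t in keep]


keep = {"aaa", "bbb", "ccc"}
tokens = ["aaa", "ddd", "eee", "ccc"]

print(remove_while_iterating(tokens, keep))  # ['aaa', 'eee', 'ccc']
print(filter_tokens(tokens, keep))           # ['aaa', 'ccc']
```

Two unwanted symbols in a row are enough to trigger the problem, which is probably why it shows up only in some files. Building a new list, as the list comprehension does, avoids it.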
1 Comment

It worked for some files, but in others the symbols are run together, like "aaabbb". The first with block in my code separated them with spaces. I will try to separate them inside your second with block.
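
One possible way to handle symbols that are run together, sketched here as an assumption rather than as what either poster ended up using, is to pull the known symbols out of each line with a regular expression instead of splitting on spaces. The helper name extract_symbols is hypothetical. Like the replace() approach in the question, this also matches a symbol that appears inside a longer word.

```python
import re


def extract_symbols(text, keep):
    """Return one string per line of text, containing only symbols from keep."""
    # Ignore blank entries, and try longer symbols first so that a longer
    # symbol wins over a shorter one that is its prefix.
    ordered = sorted((s for s in keep if s), key=len, reverse=True)
    if not ordered:
        return ["" for _ in text.splitlines()]
    pattern = re.compile("|".join(re.escape(s) for s in ordered))
    # findall returns the matches in the order they occur in the line.
    return [" ".join(pattern.findall(line)) for line in text.splitlines()]


keep = ["aaa", "bbb", "ccc"]
text = "aaa bbb cccddd\nccc ddd aaa eee\naaabbb"
print(extract_symbols(text, keep))
# ['aaa bbb ccc', 'ccc aaa', 'aaa bbb']
```

To use it with the files from the question, read fileTWO.txt into keep with splitlines(), read fileONE.txt into text, and write the returned lines to fileTHREE.txt.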
