I have many text files (around 3000) in a folder, and in each file the 193rd line is the only line with important information. How can I read all of these files into one single text file using Python?
4 Answers
There is a function named listdir in the os module. This function returns a list of all files in a given directory, and you can then access each file in a for loop. This tutorial on the listdir function may be helpful: https://www.geeksforgeeks.org/python-os-listdir-method/
The example code is below.
import os

# your file path
path = ""

# store the file names, or pass os.listdir(path) directly to the for loop
dir_files = os.listdir(path)

# this list stores the important text line (the 193rd line) from each file
texts = []
for file in dir_files:
    with open(os.path.join(path, file)) as f:
        # t holds the 193rd line (index 192) on each iteration
        t = f.read().split('\n')[192]
        # append each file's line to the texts list
        texts.append(t)

# print each important line
for text in texts:
    print(text)
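Since the question asks for all of these lines to end up in one single text file, the same loop can write them out as it goes. Here is a minimal sketch of that idea; the helper name collect_line and the throwaway demo folder are made up for illustration:

```python
import os
import tempfile

def collect_line(folder, out_path, line_no=193):
    # write the line_no-th line of every file in folder to out_path
    with open(out_path, "w") as out:
        for name in sorted(os.listdir(folder)):
            with open(os.path.join(folder, name)) as f:
                lines = f.read().split("\n")
            if len(lines) >= line_no:       # skip files that are too short
                out.write(lines[line_no - 1] + "\n")

# demo with a throwaway folder and two tiny four-line files
src = tempfile.mkdtemp()
for name in ("a.txt", "b.txt"):
    with open(os.path.join(src, name), "w") as f:
        f.write("\n".join(f"{name} line {i}" for i in range(1, 5)))

out_path = os.path.join(tempfile.mkdtemp(), "combined.txt")
collect_line(src, out_path, line_no=3)
with open(out_path) as f:
    print(f.read())   # the 3rd line of a.txt, then the 3rd line of b.txt
```

The length check matters with 3000 files: a single short file would otherwise raise an IndexError and abort the whole run.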
You can loop through your file paths, read the text from every file, and then split it into lines using your OS's line separator.
import os

file_paths = [f"/path/to/file{i}.ext" for i in range(3000)]
info = []
for p in file_paths:
    with open(p, "r") as file:
        text = file.read()           # read() takes no path argument
        line_list = text.split(os.linesep)
        info.append(line_list[192])  # the 193rd line
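One caveat about splitting on os.linesep: Python's text mode already translates every line-ending convention to "\n" when reading, while os.linesep is "\r\n" on Windows, so the split above can fail to find any separators there. str.splitlines() sidesteps the issue entirely; a quick sketch:

```python
import os

# what text-mode reading gives you on any OS
text = "first\nsecond\nthird"

# fragile: os.linesep is "\r\n" on Windows, so this may not split at all there
print(text.split(os.linesep))

# robust: splitlines() handles every newline convention
print(text.splitlines())   # three separate lines
```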
2 Comments
Martynas Markevičius
Reading the file as a string and then splitting it by linesep is essentially the same as file.readlines(). Also, using a list comprehension to get a list of files like that is kind of unreliable, because not every file may have the same name, nor would there be exactly 3000 files. Best is to do os.listdir("/path/to/folder") to get the file names.
Philipp
I used the list comprehension just to create a runnable example. I did not know about readlines()... Learned something today :)
You could try this, if it's what you want:
import os

important_lines = []
for textfile in os.listdir('data'):
    with open(os.path.join('data', textfile)) as f:
        text_from_file = f.readlines()
    # index 192 only exists if the file has at least 193 lines
    if len(text_from_file) >= 193:
        important_line = text_from_file[192]
        important_lines.append(important_line)

# important_lines is now the list of 193rd lines from the files
2 Comments
Rishu Mcr
This reads the files in a rather arbitrary order. Is there a way to read them in order of the time they were added?
Stsh4lson
You can use something like this, where files is the list of file paths:
ascending: sorted_by_mtime_ascending = sorted(files, key=lambda t: os.stat(t).st_mtime)
descending: sorted_by_mtime_descending = sorted(files, key=lambda t: -os.stat(t).st_mtime)
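The snippet above can be expanded into a runnable sketch. The throwaway folder and the sleep calls are only there to give the demo files distinct modification times:

```python
import os
import tempfile
import time

folder = tempfile.mkdtemp()
for name in ("b.txt", "a.txt", "c.txt"):
    with open(os.path.join(folder, name), "w") as f:
        f.write(name)
    time.sleep(0.05)   # pause so the modification times differ

files = [os.path.join(folder, n) for n in os.listdir(folder)]

# oldest first, by modification time
by_mtime = sorted(files, key=lambda p: os.stat(p).st_mtime)
print([os.path.basename(p) for p in by_mtime])   # creation order: b, a, c

# newest first: reverse=True reads more clearly than negating the key
newest_first = sorted(files, key=lambda p: os.stat(p).st_mtime, reverse=True)
```

Note that st_mtime is the last modification time, not the time the file was added to the folder; on most systems there is no portable "time added" timestamp, so mtime is the usual stand-in.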