With Python, a CSV file can be read line by line or row by row. I want to read one specific line (line number 24, for example) without reading the whole file and all of its lines.
Possible duplicate of "Start reading and writing on specific line on CSV with Python" – GhitaB, Jun 21, 2015 at 11:59
2 Answers
You can use linecache.getline:
linecache.getline(filename, lineno[, module_globals])
Get line lineno from file named filename. This function will never raise an exception — it will return '' on errors (the terminating newline character will be included for lines that are found).
import linecache
line = linecache.getline("foo.csv",24)
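If the line then needs to be split into fields, one option is to pass it through csv.reader. The sketch below is only an illustration: get_csv_row is a hypothetical helper name, and the temporary file stands in for foo.csv. Note that linecache caches the file's lines in memory, so this is convenient rather than a way to avoid reading the file, and a quoted field that spans several physical lines would not be handled.

```python
import csv
import linecache
import os
import tempfile


def get_csv_row(path, lineno):
    """Return the fields of 1-based line `lineno`, or None if the line does not exist.

    Hypothetical helper: getline returns '' for a missing line or file.
    """
    line = linecache.getline(path, lineno)
    if not line:
        return None
    # csv.reader accepts any iterable of strings, so wrap the single line in a list.
    return next(csv.reader([line]))


# Build a small throwaway file so the sketch can be run as-is.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    for i in range(1, 31):
        tmp.write(f"{i},name{i}\n")
    demo_path = tmp.name

row_24 = get_csv_row(demo_path, 24)    # fields of line 24
missing = get_csv_row(demo_path, 999)  # no such line

linecache.clearcache()  # drop the cached lines before deleting the file
os.remove(demo_path)
```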
Or use the consume recipe from itertools to move the pointer:
import collections
from itertools import islice

def consume(iterator, n):
    "Advance the iterator n-steps ahead. If n is None, consume entirely."
    # Use functions that consume iterators at C speed.
    if n is None:
        # feed the entire iterator into a zero-length deque
        collections.deque(iterator, maxlen=0)
    else:
        # advance to the empty slice starting at position n
        next(islice(iterator, n, n), None)

with open("foo.csv") as f:
    consume(f, 23)    # skip the first 23 lines
    line = next(f)    # line 24
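A shorter variant of the same idea, as a sketch: islice can do the skipping and the picking in one step, and wrapping the file in csv.reader returns the row already split into fields. read_row is a hypothetical helper name and the in-memory sample stands in for a real file. The lines before the target are still read and discarded; they are just never kept. With csv.reader the position counts CSV rows, which differs from physical lines only when quoted fields contain newlines.

```python
import csv
import io
from itertools import islice


def read_row(lines, lineno):
    """Return the parsed row at 1-based position `lineno`, or None if there are too few rows.

    `lines` is any iterable of text lines, for example an open file.
    """
    reader = csv.reader(lines)
    # islice skips lineno - 1 rows and yields at most one; the default covers a short file.
    return next(islice(reader, lineno - 1, lineno), None)


# In-memory stand-in for a 30-line CSV file.
sample = io.StringIO("".join(f"{i},name{i}\n" for i in range(1, 31)))
row = read_row(sample, 24)
too_short = read_row(io.StringIO("a,b\n"), 5)
```

With a real file this would be called as `with open("foo.csv", newline="") as f: row = read_row(f, 24)`.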
4 Comments
Padraic Cunningham
@xtofl, a file object is its own iterator; when you write for line in f: ..., next is called repeatedly.
user3967257
And to start reading from a specific line rather than from the beginning? It works with simply consume(f, X), incrementing X each time (with X initialized to the desired position). Thanks for your useful answer :)
Padraic Cunningham
@user3967257, use the consume recipe if you want to start from a certain line. The second argument to consume is the number of lines to skip; then just use for line in f: ... to read the rest of the lines.
user3967257
This is what I mean: for i in range(X, limit): consume(f, i)
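As the loop in the last comment is written, each pass would skip i more lines instead of reading one, so it probably does not do what was intended. A minimal sketch of reading lines X up to (but not including) limit, assuming 1-based line numbers and the same half-open convention as range(X, limit); read_lines is a hypothetical helper name:

```python
import io
from itertools import islice


def read_lines(f, start, stop):
    """Return the lines numbered start .. stop - 1 (1-based) from an iterable of lines."""
    # islice is 0-based, so shift both bounds down by one.
    return list(islice(f, start - 1, stop - 1))


# In-memory stand-in for a 10-line file.
sample = io.StringIO("".join(f"line{i}\n" for i in range(1, 11)))
chunk = read_lines(sample, 4, 7)  # lines 4, 5 and 6
```

Because the file object is its own iterator, a later for line in f: ... continues from the line after the slice.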
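Alternatively, you can use the nrows and skiprows arguments of pandas.read_csv:
import pandas as pd
line_number = 30
# header=None keeps the first line after the skipped ones as data instead of using it as column names
pd.read_csv('big.csv.gz', sep="\t", header=None, nrows=1, skiprows=line_number - 1)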
Remember that skiprows can also be a list, so if you need the header row, use:
pd.read_csv('big.csv.gz', sep = "\t", nrows = 1, skiprows = list(range(1, line_number - 1)))