Multiline file read in Python

Question

I am looking for a way in Python to read multiple lines from a file (10 lines at a time). I have already looked into readlines(sizehint) and tried passing 10, but it doesn't read only 10 lines; it actually reads to the end of the file (I tried this on a small file). Each line is 11 bytes long, and each read should fetch 10 lines. If fewer than 10 lines remain, it should return only those lines. My actual file contains more than 150K lines.

Any idea how I can achieve this?

readlines() with no argument reads the whole file. The optional argument is a size hint (roughly a byte count), not a limit on the number of lines.
– jdi, Commented Oct 8, 2012 at 23:48
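
To illustrate the comment above, here is a small sketch, assuming Python 3 and an in-memory io.StringIO standing in for the real file (fake_file and got are names made up for the example). The argument to readlines() is treated as a size budget, so passing 10 does not give back ten lines.

```python
import io

# 30 lines of 11 characters each (10 digits plus the newline),
# standing in for the file described in the question.
fake_file = io.StringIO("".join("%010d\n" % i for i in range(30)))

# The argument is a size hint, not a line count.
got = fake_file.readlines(10)

print(len(got))  # some number of whole lines, but not 10
```

How many lines come back depends on the Python version and the buffering involved, which is why the hint is not a reliable way to batch by line count.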

Ashwini Chaudhary · Accepted Answer · 2015-08-07 10:01:13Z

9

You're looking for itertools.islice():

from itertools import islice

with open('data.txt') as f:
    lines = []
    while True:
        chunk = list(islice(f, 10))  # islice returns an iterator, so convert it to a list here
        if chunk:
            # do something with the current batch of <= 10 lines here
            lines.append(chunk)  # e.g. store it
        else:
            break
    print(lines)
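
If the batches are needed in more than one place, the same islice idea can be wrapped in a generator. This is only a sketch: chunked_lines is a hypothetical helper name, and io.StringIO stands in for the real file so the example runs on its own.

```python
import io
from itertools import islice


def chunked_lines(fileobj, n=10):
    """Yield lists of up to n lines until the file is exhausted."""
    while True:
        chunk = list(islice(fileobj, n))
        if not chunk:
            return  # nothing left to read
        yield chunk


# 25 short lines standing in for data.txt
sample = io.StringIO("".join("row %d\n" % i for i in range(25)))
sizes = [len(chunk) for chunk in chunked_lines(sample)]
print(sizes)  # [10, 10, 5]
```

The last batch is simply shorter, which matches the "return only those lines" requirement in the question.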

edited Aug 7, 2015 at 10:01

answered Oct 8, 2012 at 23:52

Ashwini Chaudhary


5 Comments

inspectorG4dget Over a year ago

The problem with using islice is that the islice object is read-once only. So if the OP wants to check previous lines based on conditions on the current line, he would have to cache them manually. It might be better to use a list (depending on the use case).

jdi Over a year ago

@inspectorG4dget: But the same problem exists for your answer as well. You are reading 10 lines and returning them. It is still up to the caller to cache previous batches of lines.

inspectorG4dget Over a year ago

@jdi: That's not what I meant. If I read lines 1-5 and, on line 6, wanted to look at line 3, I can't do that with islice without another cache. I was referring to lines in the same batch, not lines in previous batches.

jdi Over a year ago

He is converting it to a list right here in the code. All it requires is assigning it to a variable, e.g. lines = list(lines). I think the print is just for simplicity's sake.

jdi Over a year ago

I only see one edit. Originally it was being converted to a list and just printed, not saved. Anyway, +1, good answer.
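
For anyone following the exchange above, here is a small sketch of the difference between the raw islice object and the list made from it. It assumes Python 3 and uses io.StringIO in place of a real file; the variable names are made up for the example.

```python
import io
from itertools import islice

# Seven lines standing in for a real file
f = io.StringIO("".join("line %d\n" % i for i in range(1, 8)))

batch_iter = islice(f, 5)
first_pass = list(batch_iter)   # consumes lines 1-5
second_pass = list(batch_iter)  # the islice object is already exhausted

third_line = first_pass[2]      # the list can be indexed as often as needed
next_line = f.readline()        # the file itself has moved on to line 6

print(second_pass)  # []
print(third_line)   # line 3
```

So looking back within a batch works once the batch has been turned into a list; looking back across batches still needs the caller to keep the earlier lists around.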

inspectorG4dget · Accepted Answer · 2012-10-08 23:47:16Z

3

This should do it:

def read10Lines(fp):
    answer = []
    for i in range(10):
        line = fp.readline()
        if not line:  # readline() returns '' at end of file
            break
        answer.append(line)
    return answer

Or, the list comprehension:

ten_lines = [line for line in (fp.readline() for _ in range(10)) if line]  # drops the '' returned at EOF

In both cases, fp = open('path/to/file')
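
To walk a whole file with this approach, one option is the two-argument form of iter(), which keeps calling a function until it returns a sentinel value. The sketch below is self-contained: read_n_lines is a hypothetical helper that stops at end of file, and io.StringIO stands in for open('path/to/file').

```python
import io


def read_n_lines(fp, n=10):
    """Return up to n lines, fewer if the file ends first."""
    lines = []
    for _ in range(n):
        line = fp.readline()
        if not line:  # readline() returns '' at end of file
            break
        lines.append(line)
    return lines


# 23 lines standing in for a real file
fp = io.StringIO("".join("item %d\n" % i for i in range(23)))

# iter(callable, sentinel) stops as soon as the helper returns []
batches = list(iter(lambda: read_n_lines(fp), []))
print([len(b) for b in batches])  # [10, 10, 3]
```

This only terminates because the helper returns an empty list at end of file; a version that keeps appending '' would loop forever.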

answered Oct 8, 2012 at 23:47

inspectorG4dget


mgilson · Accepted Answer · 2012-10-09 00:53:05Z

1

Another solution, which replaces the infinite loop with a more familiar for loop, relies on itertools.izip_longest (itertools.zip_longest in Python 3) and a small trick with iterators. The trick is that zip(*[iter(iterator)]*n) breaks iterator up into chunks of size n. Since a file is already a generator-like iterator (as opposed to being sequence-like), we can write:

from itertools import zip_longest  # izip_longest in Python 2

with open('data.txt') as f:
    for ten_lines in zip_longest(*[f]*10, fillvalue=None):
        if ten_lines[-1] is None:
            # the last chunk is padded with None; drop the padding
            ten_lines = [line for line in ten_lines if line is not None]
        process(ten_lines)  # process() stands in for your own handling code
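
The iterator trick is easier to see on a short string than on a file. Here is a minimal sketch, assuming Python 3 (where izip_longest is spelled zip_longest); the variable names are only for illustration.

```python
from itertools import zip_longest

letters = "abcdefg"

# The same iterator object is repeated three times, so each output
# tuple pulls three consecutive items from it.
strict = list(zip(*[iter(letters)] * 3))

# zip_longest keeps the incomplete final chunk and pads it.
padded = list(zip_longest(*[iter(letters)] * 3, fillvalue=None))

# Strip the padding from each chunk.
trimmed = [tuple(x for x in group if x is not None) for group in padded]

print(strict)   # [('a', 'b', 'c'), ('d', 'e', 'f')]
print(trimmed)  # [('a', 'b', 'c'), ('d', 'e', 'f'), ('g',)]
```

Plain zip silently drops the leftover 'g', which is why the padded variant is the one to use when the last few lines of a file matter.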

edited Oct 9, 2012 at 0:53

answered Oct 9, 2012 at 0:36

mgilson


John La Rooy · Accepted Answer · 2012-10-09 00:16:07Z

0

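from itertools import groupby, count

with open("data.txt") as f:
    # the shared counter c numbers the lines 0, 1, 2, ...; // 10 gives each run of ten lines one key
    groups = groupby(f, key=lambda x, c=count(): next(c) // 10)
    for k, v in groups:
        bunch_of_lines = list(v)
        print(bunch_of_lines)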
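
The answer gives no explanation, so here is one reading of it as a sketch: groupby starts a new group whenever the key changes, and integer-dividing a running line number by the batch size produces a key that changes every n lines. The variant below uses enumerate instead of the default-argument counter; grouped_lines is a hypothetical name, and io.StringIO stands in for data.txt.

```python
import io
from itertools import groupby


def grouped_lines(fileobj, n=10):
    """Yield lists of up to n consecutive lines using groupby."""
    # enumerate pairs each line with its index; index // n is the batch number
    numbered = enumerate(fileobj)
    for _, group in groupby(numbered, key=lambda pair: pair[0] // n):
        yield [line for _, line in group]


# 12 lines standing in for data.txt
sample = io.StringIO("".join("rec %d\n" % i for i in range(12)))
sizes = [len(batch) for batch in grouped_lines(sample, n=5)]
print(sizes)  # [5, 5, 2]
```

Each group has to be consumed before moving on to the next one, which the list comprehension inside the generator takes care of.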

answered Oct 9, 2012 at 0:16

John La Rooy
