59

In Python, is there a built-in way to do a readline() on a string? I have a large chunk of data and want to strip off just the first couple of lines without doing split() on the whole string.

Hypothetical example:

def handleMessage(msg):
   headerTo  = msg.readline()
   headerFrom= msg.readline()
   sendMessage(headerTo,headerFrom,msg)

msg = "Bob Smith\nJane Doe\nJane,\nPlease order more widgets\nThanks,\nBob\n"
handleMessage(msg)

I want this to result in: sendMessage("Bob Smith", "Jane Doe", "Jane,\nPlease order...")

I know it would be fairly easy to write a class that does this, but I'm looking for something built-in if possible.

EDIT: Python v2.7

5 Answers

100

In Python 3, you can use io.StringIO:

>>> msg = "Bob Smith\nJane Doe\nJane,\nPlease order more widgets\nThanks,\nBob\n"
>>> msg
'Bob Smith\nJane Doe\nJane,\nPlease order more widgets\nThanks,\nBob\n'
>>>
>>> import io
>>> buf = io.StringIO(msg)
>>> buf.readline()
'Bob Smith\n'
>>> buf.readline()
'Jane Doe\n'
>>> len(buf.read())
44

In Python 2, you can use StringIO (or cStringIO if performance is important):

>>> import StringIO
>>> buf = StringIO.StringIO(msg)
>>> buf.readline()
'Bob Smith\n'
>>> buf.readline()
'Jane Doe\n'
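
Applied to the function from the question, this could look roughly like the sketch below. It assumes Python 3; the names handle_message and send_message are hypothetical stand-ins for the question's handleMessage and sendMessage, with the sender passed in as a callback so the snippet runs on its own.

```python
import io

def handle_message(msg, send_message):
    # Wrap the string in an in-memory text buffer so it can be read line by line.
    buf = io.StringIO(msg)
    # readline() keeps the trailing newline, so strip it from the headers.
    header_to = buf.readline().rstrip("\n")
    header_from = buf.readline().rstrip("\n")
    # read() returns whatever is left after the lines already consumed.
    send_message(header_to, header_from, buf.read())

# Hypothetical sender that just records what it was given.
sent = []
msg = "Bob Smith\nJane Doe\nJane,\nPlease order more widgets\nThanks,\nBob\n"
handle_message(msg, lambda to, frm, body: sent.append((to, frm, body)))
```

Because the buffer tracks its own position, the same pattern should work when the number of header lines is not known up front: keep calling readline() until the headers end, then call read() for the rest.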
2 Comments

Perfect. buf.read() returns what's left of the buffer after reading a certain number of lines.
Why allocate extra memory for buf?
41

The easiest way in both Python 2 and 3 is to use the string method splitlines(), which returns a list of lines.

>>> "some\nmultiline\nstring\n".splitlines()
['some', 'multiline', 'string']
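
A small sketch of how the line endings behave, assuming Python 3 (in Python 2.7 the flag would have to be passed positionally, as splitlines(True)). The variable names are illustrative only.

```python
text = "some\r\nmultiline\nstring\n"

# Default: both \r\n and \n are recognized and removed.
without_ends = text.splitlines()

# keepends=True keeps each line's original terminator.
with_ends = text.splitlines(keepends=True)

# With the terminators kept, joining the pieces restores the input.
roundtrip = "".join(with_ends)
```

Note that splitlines() still processes the whole string, which is what the question was hoping to avoid for large inputs.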

4 Comments

And str.splitlines() is cross-platform: it handles line endings such as \r\n or \n.
I think that this is more convenient than the accepted answer.
Fun fact: it removes all the \n or \r\n, so if you need to keep them while splitting the lines, either use StringIO or pass keepends=True to splitlines().
@RatulHasan From the docs: "str.splitlines(keepends=False)". Set keepends to True if you want to keep them.
22

Why not do only as many splits as you need? Since you're using all of the resulting parts (including the rest of the string), loading it into some other buffer object and then reading it back out again is probably going to be slower, not faster (plus the overhead of function calls).

If you want the first N lines separated out, just do .split("\n", N).

>>> foo = "ABC\nDEF\nGHI\nJKL"
>>> foo.split("\n", 1)
['ABC', 'DEF\nGHI\nJKL']
>>> foo.split("\n", 2)
['ABC', 'DEF', 'GHI\nJKL']

So for your function:

def handleMessage(msg):
   headerTo, headerFrom, msg = msg.split("\n", 2)
   sendMessage(headerTo,headerFrom,msg)

or if you really wanted to get fancy:

# either...
def handleMessage(msg):
   sendMessage(*msg.split("\n", 2))

# or just...
handleMessage = lambda msg: sendMessage(*msg.split("\n", 2))
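
If the message might contain fewer lines than expected, unpacking the result of split() raises ValueError. One possible way around that, sketched here with a hypothetical helper named split_headers, is str.partition(), which always returns three parts and never raises:

```python
def split_headers(msg, count):
    """Peel up to `count` leading lines off msg; return (headers, rest)."""
    headers = []
    rest = msg
    for _ in range(count):
        if not rest:
            # Ran out of text before finding `count` lines.
            break
        # partition() splits at the first newline only, leaving the tail intact.
        line, _sep, rest = rest.partition("\n")
        headers.append(line)
    return headers, rest

msg = "Bob Smith\nJane Doe\nJane,\nPlease order more widgets\nThanks,\nBob\n"
headers, body = split_headers(msg, 2)
```

The caller can then check len(headers) instead of catching an exception.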

2 Comments

This works if you don't mind dealing with exceptions and/or checking array lengths that come with split AND you know how many lines to read in advance. I don't know how many header lines there are in advance. Perhaps my example was excessively trivial.
If you're inspecting things as you go, then yes, StringIO is probably your best bet. (No worries re: examples - constructing the appropriate example can often be a difficult balance between simplifying and not losing context.)
6

Do it like StringIO does it:

i = self.buf.find('\n', self.pos)

So this means:

pos = msg.find("\n")
first_line = msg[:pos] if pos != -1 else msg  # find() returns -1 when there is no newline
...

Seems more elegant than using the whole StringIO...
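
To read more than one line this way, the search position has to be carried forward between calls. A minimal sketch, assuming "\n" line endings; read_line is a hypothetical helper, not something from the standard library:

```python
def read_line(text, pos=0):
    """Return (line, next_pos) for the line starting at index pos."""
    end = text.find("\n", pos)
    if end == -1:
        # No newline left: the remainder of the string is the last line.
        return text[pos:], len(text)
    # Skip past the newline so the next call starts on the following line.
    return text[pos:end], end + 1

msg = "Bob Smith\nJane Doe\nJane,\nPlease order more widgets\nThanks,\nBob\n"
header_to, pos = read_line(msg)
header_from, pos = read_line(msg, pos)
body = msg[pos:]
```

Only the slices that are actually requested get copied, and the number of header lines does not need to be known in advance.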

4

In Python, strings have a splitlines() method:

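msg = "Bob Smith\nJane Doe\nJane,\nPlease order more widgets\nThanks,\nBob\n"
msg_splitlines = msg.splitlines(True)  # keep line endings so the body can be rejoined unchanged
headerTo = msg_splitlines[0].rstrip("\r\n")
headerFrom = msg_splitlines[1].rstrip("\r\n")
body = "".join(msg_splitlines[2:])  # everything after the two header lines
sendMessage(headerTo, headerFrom, body)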
