3

I have a multi-line string:

inputString = "Line 1\nLine 2\nLine 3"

I want an array in which each element contains at most 2 lines, as below:

outputStringList = ["Line 1\nLine 2", "Line 3"]

Can I convert inputString to outputStringList in Python? Any help will be appreciated.

2
  • 1
    What have you tried so far? Please show your code. Commented Sep 15, 2017 at 7:39
  • 1
    Also, please explain WHY you'd want some lines to be split, but not others. This makes no sense. Commented Sep 15, 2017 at 7:42

7 Answers

3

You could use a regular expression that matches either 2 lines (with a lookahead so the trailing linefeed is not captured) or a single line (to handle a last, odd line). I expanded your example to show that it works for more than 3 lines, with a little "cheat": appending a newline to the end so that all cases are handled:

import re

s = "Line 1\nLine 2\nLine 3\nline4\nline5"
result = re.findall(r'(.+?\n.+?(?=\n)|.+)', s+"\n")

print(result)

result:

['Line 1\nLine 2', 'Line 3\nline4', 'line5']

The "add newline" cheat is what lets an input with an even number of lines be processed properly:

    s = "Line 1\nLine 2\nLine 3\nline4\nline5\nline6"

result:

['Line 1\nLine 2', 'Line 3\nline4', 'line5\nline6']
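The same idea can be generalized to groups of n lines without appending a newline. The following is only a sketch: the name group_lines_regex and the parameter n are made up for illustration, n is assumed to be at least 1, and the input is assumed to contain no empty lines (blank lines are skipped rather than preserved).

```python
import re


def group_lines_regex(text, n=2):
    """Return chunks of at most n non-empty lines each (hypothetical helper)."""
    # One non-empty line, then greedily up to n - 1 further "\n + line" pieces.
    pattern = r'[^\n]+(?:\n[^\n]+){0,%d}' % (n - 1)
    return re.findall(pattern, text)


print(group_lines_regex("Line 1\nLine 2\nLine 3\nline4\nline5"))
# ['Line 1\nLine 2', 'Line 3\nline4', 'line5']
```

Because the pattern has no capturing group, re.findall returns the whole match for each chunk.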

2 Comments

Nice, but not working for s = "Line 1\nLine 2\nLine 3\nline4\nline5\nline6"
@Frane right! see my edit. Simple and handles all cases, odd, even, ending with newline or not.
2

Here is an alternative using the grouper itertools recipe to group any number of lines together.

Note: you can implement this recipe by hand, or install a third-party library that implements it for you, e.g. pip install more_itertools.

Code

from more_itertools import grouper


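def group_lines(iterable, n=2):
    # Recent more_itertools releases expect grouper(iterable, n);
    # older ones used grouper(n, iterable).
    return ["\n".join(line for line in lines if line)
            for lines in grouper(iterable.split("\n"), n, fillvalue="")]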

Demo

s1 = "Line 1\nLine 2\nLine 3"
s2 = "Line 1\nLine 2\nLine 3\nLine4\nLine5"


group_lines(s1)
# ['Line 1\nLine 2', 'Line 3']

group_lines(s2)
# ['Line 1\nLine 2', 'Line 3\nLine4', 'Line5']

group_lines(s2, n=3)
# ['Line 1\nLine 2\nLine 3', 'Line4\nLine5']

Details

group_lines() splits the string into lines and then groups the lines by n via grouper.

list(grouper(s1.split("\n"), 2, fillvalue=""))
# [('Line 1', 'Line 2'), ('Line 3', '')]

Finally, for each group of lines, only non-empty strings are rejoined with a newline character.

See more_itertools docs for more details on grouper.
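If installing more_itertools is not an option, similar grouping can be sketched with only the standard library. This is not the grouper recipe itself but an itertools.islice-based alternative; the name group_lines_islice is hypothetical. Unlike group_lines() above, it uses no fill value and keeps empty lines.

```python
from itertools import islice


def group_lines_islice(text, n=2):
    """Split text into chunks of at most n lines each (hypothetical helper)."""
    lines = iter(text.split("\n"))
    # Repeatedly take the next n lines until islice yields an empty chunk.
    return ["\n".join(chunk) for chunk in iter(lambda: list(islice(lines, n)), [])]


print(group_lines_islice("Line 1\nLine 2\nLine 3"))
# ['Line 1\nLine 2', 'Line 3']
```

The two-argument form iter(callable, sentinel) keeps calling the lambda until it returns the sentinel, here an empty list.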


1

I hope I understand your logic correctly: if you want a list of strings, each containing at most one newline delimiter, then the following snippet will work:

# Newline-delimited string
a = "Line 1\nLine 2\nLine 3\nLine 4\nLine 5\nLine 6\nLine 7"
# Resulting list
b = []

# First split the string into "1-line-long" pieces
a = a.split("\n")

for i in range(1, len(a), 2):

    # Then join the pieces by 2's and append to the resulting list
    b.append(a[i - 1] + "\n" + a[i])

# Account for an odd-sized list (including a single line):
# the last piece has no partner, so append it on its own
if len(a) % 2 == 1:
    b.append(a[-1])

print(b)

# ['Line 1\nLine 2', 'Line 3\nLine 4', 'Line 5\nLine 6', 'Line 7']

Although this solution is neither the fastest nor the best, it is easy to understand and does not involve extra libraries.
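A more compact variant of the same split-then-join idea uses list slicing, which also generalizes to any group size. This is a minimal sketch; the name chunk_lines and the parameter n are hypothetical and not part of the original snippet.

```python
def chunk_lines(text, n=2):
    """Join every n consecutive lines of text (hypothetical helper)."""
    lines = text.splitlines()
    # Step through the list n lines at a time; the last slice may be shorter.
    return ["\n".join(lines[i:i + n]) for i in range(0, len(lines), n)]


print(chunk_lines("Line 1\nLine 2\nLine 3"))
# ['Line 1\nLine 2', 'Line 3']
print(chunk_lines("Line 1\nLine 2\nLine 3\nLine 4\nLine 5", n=3))
# ['Line 1\nLine 2\nLine 3', 'Line 4\nLine 5']
```

Slicing past the end of a list simply returns the remaining items, so no special case is needed for a short final group.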


1

I wanted to post the grouper recipe from the itertools docs as well, but PyToolz' partition_all is actually a bit nicer.

from toolz import partition_all

s = "Line 1\nLine 2\nLine 3\nLine 4\nLine 5"
result = ['\n'.join(tup) for tup in partition_all(2, s.splitlines())]
# ['Line 1\nLine 2', 'Line 3\nLine 4', 'Line 5']

Here's the grouper solution for the sake of completeness:

from itertools import zip_longest

# Recipe from the itertools docs.
def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

# Group the lines (not the characters of s); an odd last line is paired with None.
result = ['\n'.join((a, b)) if b else a for a, b in grouper(s.splitlines(), 2)]


-1

Use str.splitlines() to split the full input into lines:

>>> inputString = "Line 1\nLine 2\nLine 3"
>>> outputStringList = inputString.splitlines()
>>> print(outputStringList)
['Line 1', 'Line 2', 'Line 3']

Then, join the first lines to obtain the desired result:

>>> result = ['\n'.join(outputStringList[:-1])] + outputStringList[-1:]
>>> print(result)
['Line 1\nLine 2', 'Line 3']

Bonus: a function that does the same for any desired number of output elements, n:

def split_to_max_lines(inputStr, n):
    lines = inputStr.splitlines()
    # This defines which element in the list becomes the 2nd in the
    # final result. For n = 2, index = -1, for n = 4, index = -3, etc.
    split_index = -(n - 1)
    result = ['\n'.join(lines[:split_index])]
    result += lines[split_index:]
    return result

print(split_to_max_lines("Line 1\nLine 2\nLine 3\nline 4\nLine 5\nLine 6", 2))
print(split_to_max_lines("Line 1\nLine 2\nLine 3\nline 4\nLine 5\nLine 6", 4))
print(split_to_max_lines("Line 1\nLine 2\nLine 3\nline 4\nLine 5\nLine 6", 5))

Returns:

['Line 1\nLine 2\nLine 3\nline 4\nLine 5', 'Line 6']
['Line 1\nLine 2\nLine 3', 'line 4', 'Line 5', 'Line 6']
['Line 1\nLine 2', 'Line 3', 'line 4', 'Line 5', 'Line 6']


-2

I'm not sure what you mean by "a maximum of 2 lines" and how you'd hope to achieve that. However, splitting on newlines is fairly simple.

'Line 1\nLine 2\nLine 3'.split('\n')

This will result in:

['Line 1', 'Line 2', 'Line 3']

To get the weird allowance for "some" line splitting, you'll have to write your own logic for that.


-2
b = "a\nb\nc\nd".split("\n", 3)
c = ["\n".join(b[:-1]), b[-1]]
print(c)

gives

['a\nb\nc', 'd']

