0

I need to match groups that span multiple lines in Python.

group one start
line 1 data
group end
group two start
group two data
group end

Given the string above, how can I get the output below?

[group one start \n line 1 data \n group end, group two start \n group two data \n group end]

I tried the code below, but it does not work:

import re 

re.findall(r'group.*start.*group end',re.MULTILINE | re.DOTALL)

for info in data:
   print info
1
  • It's not clear what you are asking; the expected output you have provided is not valid Python. Commented Jul 28, 2019 at 2:39

3 Answers

2

Maybe an expression similar to:

\bgroup [\s\S]*? start\b[\s\S]*?\bgroup end\b

DEMO 1

or:

\bgroup .*? start\b.*?\bgroup end\b

DEMO 2

with the DOTALL flag, might work here. The first expression uses [\s\S] instead of . and therefore matches across lines without any flag.

Test with DOTALL:

import re

regex = r"\bgroup .*? start\b.*?\bgroup end\b"

test_str = """
group one start
line 1 data
group end
group two start
group two data
group end
"""

print(re.findall(regex, test_str, re.DOTALL))

Test without DOTALL:

import re

regex = r"(\bgroup [\s\S]*? start\b[\s\S]*?\bgroup end\b)"

test_str = """
group one start
line 1 data
group end
group two start
group two data
group end

"""


print(re.findall(regex, test_str))

Output

['group one start\nline 1 data\ngroup end', 'group two start\ngroup two data\ngroup end']

The expression is explained in the top-right panel of regex101.com, if you wish to explore, simplify, or modify it. In this link, you can also watch how it matches against some sample inputs.
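To see what the flag changes, here is a minimal sketch that runs the same `.`-based expression with and without `re.DOTALL`. The variable names `with_flag` and `without_flag` are chosen here for illustration only. Note that `re.MULTILINE`, which the question also passes, only changes how `^` and `$` behave; it does not let `.` match a newline.

```python
import re

regex = r"\bgroup .*? start\b.*?\bgroup end\b"

test_str = """
group one start
line 1 data
group end
group two start
group two data
group end
"""

# With DOTALL, "." also matches "\n", so a match can span several lines.
with_flag = re.findall(regex, test_str, re.DOTALL)

# Without DOTALL, "." stops at each newline, so "start" and "group end"
# can never be reached within the same match.
without_flag = re.findall(regex, test_str)

print(len(with_flag))     # 2
print(len(without_flag))  # 0
```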

1 Comment

But how do I get the output as an array?
0

You can just split the text on the newline that follows group end. A look-behind keeps group end in each piece instead of consuming it as part of the separator.

>>> import re
>>> text_data = """group one start
... line 1 data
... group end
... group two start
... group two data
... group end"""
>>> 
>>> re.split(r'(?<=group end)\n', text_data)
['group one start\nline 1 data\ngroup end', 'group two start\ngroup two data\ngroup end']
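One caveat with the split approach: if the text ends with a newline after the last group end, `re.split` also returns a trailing empty string. A minimal sketch of filtering it out, assuming the same sample data with a final newline added (the names `parts` and `groups` are illustrative):

```python
import re

# Same sample as above, but with a trailing newline after the last "group end".
text_data = (
    "group one start\nline 1 data\ngroup end\n"
    "group two start\ngroup two data\ngroup end\n"
)

# The final "\n" is also preceded by "group end", so it acts as a separator too.
parts = re.split(r'(?<=group end)\n', text_data)

# Drop the empty string left over after the last separator.
groups = [p for p in parts if p]

print(len(parts))   # 3
print(len(groups))  # 2
```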

0

The code below works for me:

import re

a = """group one start
line 1 data
group end
group two start
group two data
group end
"""
# Lazy quantifiers (.*?) with re.DOTALL keep each match within a single group
all_m = re.findall(r'group.*?start.*?group end', a, re.DOTALL)
for m in all_m:
    print(m)
    print("*************")

Output

group one start
line 1 data
group end
*************
group two start
group two data
group end
*************
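Compared with the attempt in the question, this passes the string as the second argument to `re.findall` and makes both quantifiers lazy. The following sketch shows why the lazy form matters; the names `greedy` and `lazy` are used here for illustration only.

```python
import re

text = """group one start
line 1 data
group end
group two start
group two data
group end
"""

# Greedy: ".*" runs as far as it can, so the match stretches from the
# first "group" to the last "group end" and both groups are merged.
greedy = re.findall(r'group.*start.*group end', text, re.DOTALL)

# Lazy: ".*?" stops at the first "group end" after each start line.
lazy = re.findall(r'group.*?start.*?group end', text, re.DOTALL)

print(len(greedy))  # 1
print(len(lazy))    # 2
```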
