Python regex multi line matching

Question

I need to match blocks that span multiple lines in Python.

group one start
line 1 data
group end
group two start
group two data
group end

Given the string above, how can I get the output below?

[group one start \n line 1 data \n group end, group two start \n group two data \n group end]

I tried the code below, but it does not work:

import re 

re.findall(r'group.*start.*group end',re.MULTILINE | re.DOTALL)

for info in data:
   print info

It's not clear what you are asking; the expected output you have provided is not valid Python.
– Iain Shelvington, Commented Jul 28, 2019 at 2:39

Emma Marcier · Accepted Answer · 2019-07-28 03:30:15Z

2

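Maybe an expression similar to this one, which needs no flag because [\s\S] matches any character, newlines included:

\bgroup [\s\S]*? start\b[\s\S]*?\bgroup end\b

or this one, which might work with the DOTALL flag so that . also matches newlines:

\bgroup .*? start\b.*?\bgroup end\b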

Test with `DOTALL`:

import re

regex = r"\bgroup .*? start\b.*?\bgroup end\b"

test_str = """
group one start
line 1 data
group end
group two start
group two data
group end
"""

print(re.findall(regex, test_str, re.DOTALL))

Test without `DOTALL`:

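import re

# [\s\S] matches any character, newlines included, so no flag is needed.
# With a single capture group, findall returns the text of that group.
regex = r"(\bgroup [\s\S]*? start\b[\s\S]*?\bgroup end\b)"

test_str = """
group one start
line 1 data
group end
group two start
group two data
group end
"""

print(re.findall(regex, test_str))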

Output

['group one start\nline 1 data\ngroup end', 'group two start\ngroup two data\ngroup end']

The expression is explained in the top-right panel of regex101.com if you wish to explore, simplify, or modify it. There you can also watch how it matches against sample inputs.
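As a rough sketch of why the lazy quantifier matters here (this comparison is an illustration added for clarity, and the variable names are made up): with a greedy .* the match runs to the last "group end", so both blocks come back as one string. Without DOTALL, . cannot cross a newline and nothing matches. re.MULTILINE only changes how ^ and $ behave, so it makes no difference for these patterns. In every case re.findall returns a plain Python list.

```python
import re

text = """group one start
line 1 data
group end
group two start
group two data
group end
"""

# Greedy: .* extends to the LAST "group end", merging both blocks into one match.
greedy = re.findall(r"group.*start.*group end", text, re.DOTALL)

# Lazy: .*? stops at the FIRST "group end" after each "start".
lazy = re.findall(r"group.*?start.*?group end", text, re.DOTALL)

# No DOTALL: . does not match "\n", so a block spanning lines cannot match.
no_flag = re.findall(r"group.*?start.*?group end", text)

print(len(greedy))   # 1
print(len(lazy))     # 2
print(no_flag)       # []
```

The lazy result is the two-element list shown in the output above.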

edited Jul 28, 2019 at 3:30

answered Jul 28, 2019 at 2:40

Emma Marcier


1 Comment

Gowtham Saminathan Over a year ago

But how do I get the output as an array?

Sunitha · Accepted Answer · 2019-07-28 05:04:02Z

0

You can simply split the text on the newline that follows group end. Using a look-behind keeps group end out of the separator, so it stays in the results:

>>> import re
>>> text_data = """group one start
... line 1 data
... group end
... group two start
... group two data
... group end"""
>>> 
>>> re.split(r'(?<=group end)\n', text_data)
['group one start\nline 1 data\ngroup end', 'group two start\ngroup two data\ngroup end']
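One caveat, sketched below under the assumption that the input ends with a newline (as the sample strings in the other answers do): the final newline is then also a split point, so re.split returns a trailing empty string. Filtering out empty items is one possible way to handle that; the name blocks is only illustrative.

```python
import re

# Same sample text, but with a trailing newline after the last "group end".
text_data = """group one start
line 1 data
group end
group two start
group two data
group end
"""

parts = re.split(r"(?<=group end)\n", text_data)
print(parts[-1] == "")   # True: the last newline produced an empty tail

# Drop empty pieces so only the real blocks remain.
blocks = [p for p in parts if p]
print(len(blocks))       # 2
```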

answered Jul 28, 2019 at 5:04

Sunitha


Gowtham Saminathan · Accepted Answer · 2019-07-30 01:42:40Z

0

The code below works for me:

import re

a = """group one start
line 1 data
group end
group two start
group two data
group end
"""
all_m = re.findall(r'group.*?start.*?group end',a,re.DOTALL)
for m in all_m:
    print(m)
    print("**********")

Output

group one start
line 1 data
group end
**********
group two start
group two data
group end
**********
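If the group name and the lines inside each block are needed separately, a variation with named groups and re.finditer might look like the sketch below. It assumes every block starts with a line of the form "group <name> start" and ends with a line that is exactly "group end"; the group names name and body and the dictionary blocks are made up for illustration. Here re.MULTILINE is what lets ^ and $ anchor at line boundaries, while re.DOTALL lets the body span several lines.

```python
import re

a = """group one start
line 1 data
group end
group two start
group two data
group end
"""

# ^ and $ match at line boundaries (MULTILINE); .*? may cross newlines (DOTALL).
pattern = re.compile(
    r"^group (?P<name>\w+) start\n(?P<body>.*?)^group end$",
    re.MULTILINE | re.DOTALL,
)

# Map each group name to the list of lines between its start and end markers.
blocks = {m.group("name"): m.group("body").splitlines() for m in pattern.finditer(a)}

for name, lines in blocks.items():
    print(name, lines)
```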

answered Jul 30, 2019 at 1:42

Gowtham Saminathan
