There are three DAY blocks described by the text variable:

text = """
DAY {
 foo 12 5 A
 foo 
 12345
}
DAY {
 day 1
 day 2
 file = "/Users/Shared/docs/doc.txt"
 day 3
 end of the month
}
DAY {
 01.03.2016 11:15
 01.03.2016 11:16
 01.03.2016 11:17
}"""

All three DAY definitions begin with the word DAY at the beginning of a line, followed by a space and an opening curly bracket. The end is marked by a closing bracket, always placed at the beginning of a line. So the boundaries of each DAY are defined by the curly brackets {}.

Using a regex, I need to find the DAY that contains the line file = "/Users/Shared/docs/doc.txt" inside its boundaries.

I started writing a regex:

string = """DAY {\n [A-Za-z0-9]+}"""

result = re.findall(string, text)

But the expression stops matching at the end of foo, right before the whitespace character. How can I modify the expression so that it returns the second DAY, the one that has file = "/Users/Shared/docs/doc.txt" in its body? The result should look like:

DAY {
 day 1
 day 2
 file = "/Users/Shared/docs/doc.txt"
 day 3
 end of the month
}

1 Answer

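Your pattern fails because [A-Za-z0-9]+ matches neither spaces nor newlines, so it can never reach the closing bracket. Note that re.MULTILINE is not what lets a regex match across several lines: the flag only changes ^ and $ so that they match at the start and end of every line, and it is needed only when the pattern uses those anchors. What does span lines is a negated character class such as [^{}], which matches any character except the brackets, newlines included. It can therefore consume the whole body of a block, as long as the body itself contains no curly brackets.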
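A minimal sketch of what the flag does and does not change. The sample string is a shortened stand-in for the real input (an assumption for illustration, not part of the question):

```python
import re

# Shortened stand-in for the input (an assumption, not the asker's data).
sample = "DAY {\n day 1\n}"

# A negated character class matches newlines without any flag.
no_flag = re.findall(r"\{[^{}]*\}", sample)

# re.MULTILINE only changes where ^ and $ may match.
start_only = re.findall(r"^\S+", sample)
every_line = re.findall(r"^\S+", sample, re.MULTILINE)

print(no_flag)     # ['{\n day 1\n}']
print(start_only)  # ['DAY']
print(every_line)  # ['DAY', '}']
```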

The following code should do what you asked:

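import re
# ^ pins DAY and the closing bracket to line starts (this is what re.MULTILINE is for); [^{}]* spans newlines.
regex = re.compile(r'^DAY\s*\{[^{}]*file = "/Users/Shared/docs/doc\.txt"[^{}]*^\}', re.MULTILINE)
regex.findall(text)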

Result:

['DAY {\n day 1\n day 2\n file = "/Users/Shared/docs/doc.txt"\n day 3\n end of the month\n}']
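If the body of a block may itself contain curly brackets, or the line you search for changes often, one alternative is to split the work in two: extract every block first, then filter with a plain substring test. This is only a sketch under the format described in the question (DAY { at the start of a line, the closing bracket at the start of a line); the names BLOCK and blocks_containing are made up for this example:

```python
import re

text = """
DAY {
 foo 12 5 A
 foo
 12345
}
DAY {
 day 1
 day 2
 file = "/Users/Shared/docs/doc.txt"
 day 3
 end of the month
}
DAY {
 01.03.2016 11:15
 01.03.2016 11:16
 01.03.2016 11:17
}"""

# Step 1: match from a line starting with "DAY {" up to the first later
# line starting with "}".
# re.MULTILINE lets ^ match at line starts; re.DOTALL lets . match newlines.
BLOCK = re.compile(r"^DAY \{.*?^\}", re.MULTILINE | re.DOTALL)


def blocks_containing(text, needle):
    """Return the DAY blocks that contain the literal string `needle`."""
    # Step 2: a substring test, so `needle` needs no regex escaping.
    return [block for block in BLOCK.findall(text) if needle in block]


blocks = BLOCK.findall(text)
matches = blocks_containing(text, 'file = "/Users/Shared/docs/doc.txt"')

print(len(blocks))  # 3
print(matches[0])   # the second DAY block
```

Because the search string is compared literally, characters such as the dots in the path do not have to be escaped.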