
I have a string like this:

----------

FT Weekend

----------

Why do we run marathons?
Are marathons and cycling races about more than exercise? What does the 
literature of endurance tell us about our thirst for self-imposed hardship? 

I want to delete the part from ---------- to the next ---------- included.

I have been using re.sub:

pattern =r"-+\n.+\n-+"
re.sub(pattern, '', thestring)

Comment (Jul 14, 2015 at 9:48): Could you be more precise?

3 Answers

pattern = r"-+\n.+?\n-+"
re.sub(pattern, '', thestring, flags=re.DOTALL)

Just use the DOTALL flag. The problem with your regex is that, by default, . does not match \n, so you need to pass the DOTALL flag explicitly to make it match \n.

See demo.

https://regex101.com/r/hR7tH4/24
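
As a rough illustration of the difference the flag makes, here is a minimal sketch. It assumes the string has blank lines around the heading, as the question's formatting suggests; the shortened sample text and the names unchanged and cleaned are only illustrative.

```python
import re

# Assumed shape of the input: blank lines surround the heading.
thestring = (
    "----------\n"
    "\n"
    "FT Weekend\n"
    "\n"
    "----------\n"
    "\n"
    "Why do we run marathons?\n"
)

pattern = r"-+\n.+?\n-+"

# Without DOTALL, '.' cannot cross the blank line, so nothing is replaced.
unchanged = re.sub(pattern, "", thestring)

# With DOTALL, '.' also matches '\n' and the delimited block is removed.
cleaned = re.sub(pattern, "", thestring, flags=re.DOTALL)

print(repr(unchanged))
print(repr(cleaned))
```

Note that the blank lines that followed the closing dashes are left in place by this pattern.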

or

pattern = r"-+\n[\s\S]+?\n-+"
re.sub(pattern, '', thestring)

if you don't want to add a flag.


1 Comment

@stribizhev Yes, I made it non-greedy.

Your regex doesn't match the expected part because .+ doesn't match the newline character. You can use the re.DOTALL flag (or its alias re.S) to force . to match newlines, but you can also use a negated character class instead:

>>> print(re.sub(r"-+[^-]+-+", '', s))
''

Why do we run marathons?
Are marathons and cycling races about more than exercise? What does the 
literature of endurance tell us about our thirst for self-imposed hardship? 
>>> 

Or, more precisely, you can do:

>>> print(re.sub(r"-+[^-]+-+[^\w]+", '', s))
'Why do we run marathons?
Are marathons and cycling races about more than exercise? What does the 
literature of endurance tell us about our thirst for self-imposed hardship? 
>>> 
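
One caveat worth checking before relying on the negated character class: [^-] stops at the first hyphen, so a heading that itself contains a hyphen is only partly removed. The sketch below shows this with a made-up heading ("FT Week-end") and hypothetical variable names, and compares it with a lazy DOTALL pattern.

```python
import re

# Both sample strings are invented for illustration.
plain = "----------\n\nFT Weekend\n\n----------\n\nBody text\n"
hyphenated = "----------\n\nFT Week-end\n\n----------\n\nBody text\n"

negated = r"-+[^-]+-+"
lazy_dotall = r"(?s)-+\n.+?\n-+"

# The heading has no hyphen, so the whole delimited block is removed.
a = re.sub(negated, "", plain)

# The hyphen in "Week-end" ends the match early and leaves debris behind.
b = re.sub(negated, "", hyphenated)

# The lazy DOTALL pattern only stops at dashes that start a line.
c = re.sub(lazy_dotall, "", hyphenated)

print(repr(a))
print(repr(b))
print(repr(c))
```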


The problem with your regex (-+\n.+\n-+) is twofold: . matches any character except a newline, and .+ is greedy, so once . is allowed to match newlines it can span across multiple ------- delimiters.

You can use the following regex:

pattern = r"(?s)-+\n.+?\n-+"

The (?s) inline flag (single-line mode, equivalent to re.DOTALL) makes . match any character, including a newline. The .+? pattern matches one or more characters, but as few as possible, so the match stops at the next ---- line.

See IDEONE demo
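
To make the greedy versus lazy difference concrete, here is a small sketch on an invented text with two delimited headers. The sample text and variable names are assumptions, not taken from the question.

```python
import re

# Hypothetical input with two delimited headers and two paragraphs to keep.
text = (
    "-----\nFirst header\n-----\n"
    "Keep this paragraph.\n"
    "-----\nSecond header\n-----\n"
    "Keep this one too.\n"
)

# Greedy: .+ runs to the LAST line of dashes, swallowing the first paragraph.
greedy = re.sub(r"(?s)-+\n.+\n-+", "", text)

# Lazy: .+? stops at the NEXT line of dashes, so each header goes separately.
lazy = re.sub(r"(?s)-+\n.+?\n-+", "", text)

print(repr(greedy))
print(repr(lazy))
```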

For a more thorough cleanup that also removes the whitespace around the block, I'd recommend:

pattern = r"(?s)\s*-+\n.+?\n-+\s*"

See another demo
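
A quick way to check what the extra \s* buys you, sketched on a shortened version of the question's text (the blank lines around the heading are an assumption):

```python
import re

# Shortened sample; blank lines around the heading are assumed.
thestring = (
    "----------\n"
    "\n"
    "FT Weekend\n"
    "\n"
    "----------\n"
    "\n"
    "Why do we run marathons?\n"
    "Are marathons and cycling races about more than exercise?\n"
)

# \s* on both sides also consumes the blank lines next to the dashes.
pattern = r"(?s)\s*-+\n.+?\n-+\s*"
cleaned = re.sub(pattern, "", thestring)

print(cleaned)
```

With this input the result starts directly at "Why do we run marathons?", with no leading blank lines.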

1 Comment

Not sure if you have single quotes, or not, I included them into my 2nd recommended regex.
