Let's say we have a text in which some quotes are stored in the form:

user:quote

A single text can contain multiple quotes:

Agatha Drake: She records her videos from the future? What is she, a
  f**ing time lord? Is she Michael J. Fox?

Harvey Spencer: This is just like that one movie where that one guy
  changed one tiny, little thing in his childhood to stop the girl of
  his dreams from being a crackhead in the future!

How can I extract the quotes (She records her videos from ..., This is just like that one movie ...) from the text in Python?

I tried

re.findall('\S\:\s?(.*)', text)

But it's not doing the job.

https://regex101.com/r/vH63Go/1

How can I do it in Python?

  • Is a user always at the start of a line? If so, (?m)^[^:\n]+:\s?((?:.+\n?)*) would be my approach. Commented Nov 19, 2016 at 20:23
  • Thank you @Sebastian Proske. This is what I wanted. Commented Nov 19, 2016 at 20:32

1 Answer

If your string follows a consistent format, with the user at the start of a line and a double newline ending each quote, you could use this:

(?m)^[^:\n]+:\s?((?:.+\n?)*)

It uses multiline mode and matches the start of a line, followed by one or more characters that are neither : nor a newline, followed by : and an optional whitespace character. It then captures all following non-empty lines, so the capture stops at the first blank line.

Here's a demo on regex101.
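
In Python, the pattern could be applied roughly like this. This is only a minimal sketch: the function name extract_quotes is made up for illustration, and collapsing the line breaks and indentation inside each quote into single spaces is an extra step that was not asked for in the question, so drop it if you want the raw text.

```python
import re

# Pattern from the answer: a user name at the start of a line, a colon,
# then every following non-empty line captured as the quote.
QUOTE_RE = re.compile(r"(?m)^[^:\n]+:\s?((?:.+\n?)*)")


def extract_quotes(text):
    """Return the quote part of every 'user: quote' block in text.

    Hypothetical helper. Assumes each block starts with the user at the
    beginning of a line and ends at a blank line or at the end of the text.
    """
    # With a single capture group, findall returns just the captured quotes.
    raw_quotes = QUOTE_RE.findall(text)
    # Optional step (an assumption, not part of the answer): collapse the
    # line breaks and indentation inside each quote into single spaces.
    return [" ".join(quote.split()) for quote in raw_quotes]


text = """Agatha Drake: She records her videos from the future? What is she, a
  f**ing time lord? Is she Michael J. Fox?

Harvey Spencer: This is just like that one movie where that one guy
  changed one tiny, little thing in his childhood to stop the girl of
  his dreams from being a crackhead in the future!
"""

for quote in extract_quotes(text):
    print(quote)
```

Note that a quote whose own text contains a blank line would be cut off at that blank line, since that is what the pattern treats as the end of a quote.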
