
Is there an easy way to split text into separate lines each time a specific pattern occurs? For example, I have text that looks like this:

BILLY: The sky is blue. SALLY: It really is blue. SAM: I think it looks like this: terrible.

I'd like to split the text into lines for each speaker:

BILLY: The sky is blue.
SALLY: It really is blue.
SAM: I think it looks like this: terrible.

The speaker's name is always in all caps and is followed by a colon.

2 Answers

import re

a = "BILLY: The sky is blue. SALLY: It really is blue. SAM: I think it looks like this: terrible."
# Split on the whitespace that is followed by an all-caps name and a colon.
print(re.split(r"\s(?=[A-Z]+:)", a))

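You can use re.split for this. The lookahead (?=[A-Z]+:) matches the whitespace before each speaker label without consuming the label itself, so each name stays attached to its line.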

Output: ['BILLY: The sky is blue.', 'SALLY: It really is blue.', 'SAM: I think it looks like this: terrible.']
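If you need the speaker and the spoken text separately, the same lookahead split can be wrapped in a small helper. This is only a sketch for Python 3: the names SPEAKER_BOUNDARY and split_by_speaker are made up for illustration, and it assumes a speaker label is a single all-caps word with no spaces, digits, or punctuation.

```python
import re

# Hypothetical helper names; not part of the re module.
# Lookahead: match the whitespace before an all-caps name plus colon,
# without consuming the name itself.
SPEAKER_BOUNDARY = re.compile(r"\s+(?=[A-Z]+:)")

def split_by_speaker(text):
    """Return a list of (speaker, line) pairs."""
    pairs = []
    for chunk in SPEAKER_BOUNDARY.split(text.strip()):
        # partition splits on the first colon only, so later colons
        # inside the spoken text are left alone.
        speaker, _, line = chunk.partition(":")
        pairs.append((speaker, line.strip()))
    return pairs

text = "BILLY: The sky is blue. SALLY: It really is blue. SAM: I think it looks like this: terrible."
for speaker, line in split_by_speaker(text):
    print(f"{speaker}: {line}")
```

Note that an all-caps word followed by a colon inside someone's speech (for example "NOTE:") would also be treated as a new speaker, so the pattern may need tightening for real transcripts.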


1 Comment

Thank you! I'm new to this, so this is extremely helpful. I appreciate it!

If you just want to change the text rather than have a list, you could do the following:

import re

text = "BILLY: The sky is blue. SALLY: It really is blue. SAM: I think it looks like this: terrible."
# Put a newline before every speaker label, then drop the leading one.
print(re.sub(r'([A-Z]+\:)', r'\n\1', text).lstrip())

This would print:

BILLY: The sky is blue. 
SALLY: It really is blue. 
SAM: I think it looks like this: terrible.
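The output above keeps the space that preceded each speaker label, so every line except the last ends with a trailing space. A possible variant, sketched here for Python 3 under the same assumption that labels are single all-caps words, replaces that whitespace with the newline instead of inserting the newline next to it. The variable name result is just for illustration.

```python
import re

text = "BILLY: The sky is blue. SALLY: It really is blue. SAM: I think it looks like this: terrible."

# \s+ consumes the whitespace before each speaker label and the lookahead
# leaves the label in place, so the whitespace is swapped for a newline.
result = re.sub(r"\s+(?=[A-Z]+:)", "\n", text)
print(result)
```

Because the first label has no whitespace before it, no leading newline is produced and lstrip() is not needed.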
