
"<>THIS is the place to stay at when visiting the historical area of Seattle.

Your right on the water front near the ferry's and great sea food hotel.

The breakfast was great. <>"

Above is my sample text. I want to print the text that falls between <> and <>, and I want the output to be free of newline characters (\n), like this:

THIS is the place to stay at when visiting the historical area of Seattle. Your right on the water front near the ferry's and great sea food hotel.The breakfast was great.

I have tried the following piece of code:

import re
pattern = re.compile(r'\<>(.+?)\<>',re.DOTALL|re.MULTILINE)
text = """<>THIS is the place to stay at when visiting the historical area of Seattle.

Your right on the water front near the ferry's and great sea food hotel.

The breakfast was great.
<>"""
results = pattern.findall(text)
print(results)

But I am getting results like this:

["THIS is the place to stay at when visiting the historical area of Seattle.\n\nYour right on the water front near the ferry's and great sea food hotel.\n\nThe breakfast was great.\n"]

But I don't want any newline characters in the resulting string.

  • Just use .replace("\n", "") on each found match. See ideone.com/2i5Rl8 Commented Jun 14, 2016 at 10:31
  • Both answers' ideas look similar, but the question does not make clear whether the list should remain (then Wiktor's answer is the best match) or whether there should be a single string at the end, in which case UpZone's answer solves that. In any case, both answers work, I guess ;-) Commented Jun 14, 2016 at 10:41
  • But I don't want any extra code to slow down the processing. Can I combine it with pattern = re.compile(r'\<>(.+?)\<>',re.DOTALL|re.MULTILINE)? Commented Jun 14, 2016 at 10:51

2 Answers


Use .replace("\n", "") on each match (in a list comprehension) to replace every newline with an empty string.

See the demo:

results = [x.replace("\n", "") for x in pattern.findall(text)]
# => ["THIS is the place to stay at when visiting the historical area of Seattle.Your right on the water front near the ferry's and great sea food hotel.The breakfast was great."]
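For reference, here is a self-contained version of the snippet above, as a minimal sketch. It reuses the question's sample text. The simplified pattern `<>(.+?)<>` is my assumption, not the asker's exact code: the backslash before `<` is unnecessary, and `re.MULTILINE` only changes how `^` and `$` behave, neither of which appears in the pattern.

```python
import re

# Simplified form of the question's pattern; DOTALL lets "." match newlines.
pattern = re.compile(r'<>(.+?)<>', re.DOTALL)

text = """<>THIS is the place to stay at when visiting the historical area of Seattle.

Your right on the water front near the ferry's and great sea food hotel.

The breakfast was great.
<>"""

# Remove every newline from each captured match.
results = [x.replace("\n", "") for x in pattern.findall(text)]
print(results)
```

If a space is wanted wherever a line break used to be, `" ".join(x.split())` in place of `x.replace("\n", "")` is one possible alternative; that is a suggestion of mine rather than part of the answer.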

3 Comments

But I don't want any extra code to slow down the processing. Can I combine it with pattern = re.compile(r'\<>(.+?)\<>',re.DOTALL|re.MULTILINE)?
@albinantony: No, you cannot match discontinuous text within one match operation.
I have also written a general article about extracting strings between two strings with regex; feel free to read it if you have trouble with a similar problem.
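The point about discontinuous text in the comments above can be illustrated with a small sketch. The shortened sample string and the simplified pattern are assumptions made for illustration only. A capture group always corresponds to one contiguous slice of the input, so any newlines lying between the delimiters are part of it.

```python
import re

pattern = re.compile(r'<>(.+?)<>', re.DOTALL)
text = "<>first line\n\nsecond line\n<>"  # hypothetical short sample

m = pattern.search(text)

# The group is exactly the slice of the input between its start and end
# offsets; the regex engine cannot skip characters inside that slice.
start, end = m.span(1)
print(text[start:end] == m.group(1))

# Hence the newlines are still present and have to be removed afterwards.
print("\n" in m.group(1))
```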

Just replace the characters you don't want.

e.g.

# results is the list returned by pattern.findall(text); take the first match
result_without_newline = results[0].replace('\n', '')

Hope this helps :)
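If a single string is wanted rather than a list, one possible sketch is to strip the newlines from each match and join the pieces. The sample string and the names `matches` and `single` are hypothetical. Note that calling `str()` on the list itself yields its repr, in which each newline appears as a backslash followed by `n`, so `.replace('\n', '')` would leave that text unchanged.

```python
import re

pattern = re.compile(r'<>(.+?)<>', re.DOTALL)
text = "<>one\n\ntwo\n<> ignored <>three\n<>"  # hypothetical sample with two matches

matches = pattern.findall(text)

# Strip newlines from each match, then concatenate into one string.
single = "".join(m.replace("\n", "") for m in matches)
print(single)

# The repr of the list contains no real newline characters,
# so replacing "\n" in it changes nothing.
print(str(matches).replace("\n", "") == str(matches))
```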
