I need help with a regex that extracts the string between a start_pattern and an end_pattern. If no end_pattern exists, the regex should capture all characters up to the end of the line.

Sample 1 : "BOOK1:book1A,book1B,book1C,book1D" 

Expected Result : book1A,book1B,book1C,book1D

Sample 2 : "BOOK1:book1A,book1B,book1C,book1D|BOOK2:book2A,book2B,book2C,book2DA"

Expected Result : (1)book1A,book1B,book1C,book1D (2)book2A,book2B,book2C,book2DA

I've managed to write a regex (shown below) that works when the string is terminated by "|", but I can't get it to work when there is no terminator:

(?<=BOOK1:).*(?=\|)
  • You may want to add a question mark at the end: (?<=BOOK1:).*?(?=\|)?. Also, you should use the non-greedy .*? instead of the greedy .*, unless you want "Sample 2" to match book1A,book1B,book1C,book1D|BOOK2:book2A,book2B,book2C,book2DA. Commented Jul 4, 2013 at 14:29

1 Answer

Add $ as an alternative in the lookahead and change .* to .*?:

(?<=BOOK1:|\|).*?(?=\||$)

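$ matches at the end of the string (or at the end of each line in multiline mode), so the lookahead succeeds even when no "|" follows.

.*? matches lazily, that is, it consumes as few characters as possible, so the match stops at the first "|" instead of the last one.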
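As a rough check of the idea, here is a minimal Python sketch. It makes a few assumptions that are not in the answer: Python's `re` module rejects a lookbehind whose alternatives differ in width, so `(?<=BOOK1:|\|)` is narrowed to the fixed-width `(?<=BOOK1:)`. Also, in engines that do accept the answer's pattern, the segment after "|" would be matched together with its "BOOK2:" label, because the `\|` alternative anchors only on the separator. The second pattern and the helper names `book1_items` and `all_book_items` are hypothetical; they swap the lookarounds for a capturing group to return every list without its label.

```python
import re

# Fixed-width variant of the answer's pattern (assumption: Python's `re`,
# which does not accept the variable-width lookbehind (?<=BOOK1:|\|)).
BOOK1_PATTERN = re.compile(r"(?<=BOOK1:).*?(?=\||$)")

# Hypothetical generalisation: capture the list after any BOOK<n>: label.
ANY_BOOK_PATTERN = re.compile(r"BOOK\d+:([^|]*)")

sample1 = "BOOK1:book1A,book1B,book1C,book1D"
sample2 = "BOOK1:book1A,book1B,book1C,book1D|BOOK2:book2A,book2B,book2C,book2DA"


def book1_items(text):
    """Return the text after 'BOOK1:' up to the next '|' or the end, or None."""
    match = BOOK1_PATTERN.search(text)
    return match.group(0) if match else None


def all_book_items(text):
    """Return one string per BOOK<n>: entry, without the label."""
    return ANY_BOOK_PATTERN.findall(text)


print(book1_items(sample1))
print(book1_items(sample2))
print(all_book_items(sample2))
```

The capturing-group version avoids lookbehind entirely, which may be the simpler choice when the regex engine restricts lookbehind width.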


For example, for input

a|b|c|d|e

with regex

(?<=\|).*(?=\|)

it would match b|c|d

with regex

(?<=\|).*?(?=\|)

it would match each field separately:

b
c
d
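The greedy versus lazy difference above can be reproduced with a short Python sketch (Python is an assumption here; the question does not name an engine, but both patterns use a fixed-width lookbehind, so `re` accepts them as written):

```python
import re

text = "a|b|c|d|e"

# Greedy: .* runs to the last position that is still followed by "|".
greedy = re.findall(r"(?<=\|).*(?=\|)", text)

# Lazy: .*? stops at the first position that is followed by "|".
lazy = re.findall(r"(?<=\|).*?(?=\|)", text)

print(greedy)  # ['b|c|d']
print(lazy)    # ['b', 'c', 'd']
```

Neither pattern returns "a" or "e": "a" has no "|" before it, and "e" has no "|" after it, which is the gap the `$` alternative in the lookahead closes.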