I have a string like this:

12abcc?p_auth=123ABC&ABC&s

The substring I need starts right after "p_auth=" and ends right before the first "&" symbol that follows it.

P.S. The '&' symbol and 'p_auth=' must not be included in the result.

I wrote this regex:

(p_auth).+?(?=&)

OK, that works well; it gets this substring:

p_auth=123ABC

But how do I get the string without 'p_auth='?

2 Answers

Use look-arounds:

(?<=p_auth=).*?(?=&)

See regex demo

The look-behind (?<=p_auth=) and the look-ahead (?=&) do not consume characters as they are zero-width assertions. They just check for the substring presence either before or after a certain subpattern.

A couple more words about (?<=p_auth=). It is a positive look-behind. Positive because it requires the pattern inside it to appear on the left, immediately before the "main" subpattern. If the look-behind subpattern is found, the result is just "true" and the regex goes on checking the rest of the subpatterns. If not, the match fails at that position, and the engine goes on looking for a match at the next index.

Here is some description from regular-expressions.info:

It [the look-behind] tells the regex engine to temporarily step backwards in the string, to check if the text inside the lookbehind can be matched there. (?<!a)b matches a "b" that is not preceded by an "a", using negative lookbehind. It doesn't match cab, but matches the b (and only the b) in bed or debt. (?<=a)b (positive lookbehind) matches the b (and only the b) in cab, but does not match bed or debt.
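To make the look-around behavior concrete, here is a minimal sketch in Python. The choice of Python and its `re` module is my assumption, since the question does not name a language; the sample string is the one from the question.

```python
import re

text = "12abcc?p_auth=123ABC&ABC&s"

# (?<=p_auth=) must match immediately to the left of the result,
# (?=&) immediately to the right; neither is part of the match itself.
match = re.search(r"(?<=p_auth=).*?(?=&)", text)
token = match.group(0) if match else None
print(token)  # 123ABC

# The quoted examples: positive vs. negative look-behind.
print(re.search(r"(?<=a)b", "cab") is not None)  # True
print(re.search(r"(?<=a)b", "bed") is not None)  # False
print(re.search(r"(?<!a)b", "cab") is not None)  # False
print(re.search(r"(?<!a)b", "debt").start())     # 2
```

Note that Python's `re` only accepts fixed-width look-behind patterns; `p_auth=` is fixed-width, so it works here.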

In most cases, you do not really need look-arounds. In this case, you could just use a capturing group (note the = after p_auth, so that it stays out of the group):

p_auth=(.*?)&

And get the first capturing group value.
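A rough sketch of the capturing-group approach, again assuming Python's `re` module. The pattern here includes the `=` after `p_auth`, so group 1 holds only the value, while group 0 is the whole match.

```python
import re

text = "12abcc?p_auth=123ABC&ABC&s"

# The whole match includes "p_auth=" and "&"; only group 1 is the value.
match = re.search(r"p_auth=(.*?)&", text)
whole = match.group(0) if match else None
value = match.group(1) if match else None

print(whole)  # p_auth=123ABC&
print(value)  # 123ABC
```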

The .*? pattern matches any number of characters other than a newline, but as few as are required to find a match. It is called lazy dot matching: the ? symbol makes the * quantifier expand one character at a time and stop as soon as the subsequent subpattern in the regular expression (here, &) can match.

By contrast, .*& would match the whole substring up to the last &, because the * quantifier is greedy: it consumes as many characters as it can and gives back only as many as needed for the rest of the pattern to match.

See more on the "Repetition with Star and Plus" page at regular-expressions.info.
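The difference between greedy and lazy matching is easy to see side by side. This is a small sketch assuming Python's `re` module; the negated character class `[^&]*` in the last line is not from the answer above, just another common way to stop at the first `&`.

```python
import re

text = "12abcc?p_auth=123ABC&ABC&s"

# Lazy: stops at the first "&" after p_auth=.
lazy = re.search(r"p_auth=(.*?)&", text).group(1)

# Greedy: runs to the end, then backtracks to the last "&".
greedy = re.search(r"p_auth=(.*)&", text).group(1)

# Alternative (my addition): match anything except "&".
negated = re.search(r"p_auth=([^&]*)", text).group(1)

print(lazy)     # 123ABC
print(greedy)   # 123ABC&ABC
print(negated)  # 123ABC
```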

7 Comments

Thanks. Could you tell me what does (?<=p_auth=) mean? I want to understand regex inside out.
I added more explanation; I hope you now have a clearer idea of what a look-behind is. Note that using a look-behind is rather expensive in terms of efficiency. You should really check whether a plain capturing group solution works for you. I added the capturing group solution, too.
Everything is clear, thanks! I have one additional question. '.*(?=&)' finds characters until the '&' symbol, but if we add a question mark, '.*?(?=&)', it finds the first match. I found this trick by accident. Is my logic correct? Should we always use the question mark to get the first match, or is there another method? P.S. Please link the best resource or book for learning regex. Thanks!
You are asking about the difference between the greedy and lazy quantifier. I added more explanations.
Also, see the regex tag description on SO (with many other links to great online resources), and the community SO post called "What does the regex mean".
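p_auth=(.+?)(?=&)

Simply use this and grab group 1 (the first capture). The = has to be part of the pattern outside the group, otherwise it ends up in the captured value.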

1 Comment

I wanted this (?<=p_auth=)(.+?)(?=&). Thank you for your answer too :)
