I'm trying to build a pattern for a multiline string that must start with <?php or whitespace + <?php and must NOT end with ?> or ?> + whitespace.

My attempt was /^\s?<\?php.*[^>]\s?$/s, but it did not work. I also tried a negative lookahead, to no avail.

Any idea? Thanks in advance.

  • If you just trim() the string first, this gets a lot easier and could be done with simple strpos calls. Commented Jun 11, 2015 at 21:05
  • Not possible in my case. Here I just prepare patterns to pass them over to another script. Commented Jun 12, 2015 at 9:02

2 Answers

You can use

(?s)^\s*<\?php(?!.*\?>\s*$).*+$

Regex explanation:

  • (?s) - Enable singleline mode for the whole pattern for . to match newline
  • ^ - Start of string
  • \s* - Optional whitespace, 0 or more repetitions
  • <\?php - Literal <?php
  • (?!.*\?>\s*$) - Negative lookahead asserting that the rest of the string does not end with ?> followed by optional whitespace
  • .*+$ - Matches without backtracking any characters up to the string end.

The possessive quantifier (as in .*+) enables us to consume characters once, in 1 go, and never come back in search of possible permutations.

Possessive quantifiers are a way to prevent the regex engine from trying all permutations. This is primarily useful for performance reasons.

And then we do not need to use the explicit SKIP-FAIL verbs.
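
To try this out, here is a minimal PHP sketch that runs the pattern above through preg_match. The helper name startsWithoutClosingTag and the sample strings are made up for illustration, and the / delimiters are added here because preg_match requires them.

```php
<?php
// Hypothetical helper (not part of the answer): returns true when $code
// starts with optional whitespace plus the opening tag and does not end
// with the closing tag plus optional whitespace.
function startsWithoutClosingTag(string $code): bool
{
    // The pattern from the answer, wrapped in / delimiters for preg_match.
    $pattern = '/(?s)^\s*<\?php(?!.*\?>\s*$).*+$/';
    return preg_match($pattern, $code) === 1;
}

// Leading whitespace, opening tag, no closing tag.
var_dump(startsWithoutClosingTag("  <?php\necho 1;\n"));      // bool(true)
// Closing tag followed by trailing whitespace.
var_dump(startsWithoutClosingTag("<?php\necho 1;\n?>\n  "));  // bool(false)
// No opening tag at all.
var_dump(startsWithoutClosingTag("echo 1;"));                 // bool(false)
```

The === 1 comparison matters because preg_match returns 0 for no match and false on error, so a loose truthiness check would hide errors.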

4 Comments

I think OP wants NOT end with ?> or ?>whitespace*
Yes it works though I am not a fan of double negative lookbehind. However honestly I didn't downvote your answer someone else did.
@anubhava: Now it should
Yes now it does and much better than 2 lookbehinds +1

In PHP, you can use this regex:

'/^\s*<\?php(?:.(?!\?>\s*$))*$/s'

  • ^\s*<\?php matches optional whitespace and the literal <?php at the start of the string.
  • (?:.(?!\?>\s*$))* matches zero or more characters, each of which is not immediately followed by ?>whitespace* at the end of the string, using a negative lookahead.
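
A short PHP sketch of how this pattern might be exercised. The sample strings and the $samples / $results variable names are invented for illustration; only the pattern itself comes from the answer.

```php
<?php
// Pattern quoted from the answer above.
$pattern = '/^\s*<\?php(?:.(?!\?>\s*$))*$/s';

// Made-up sample inputs: one left open, two that end with the closing tag.
$samples = [
    'open'      => "<?php\necho 'hi';\n",
    'closed'    => "<?php\necho 'hi';\n?>",
    'closed_ws' => "<?php\necho 'hi';\n?>  \n",
];

$results = [];
foreach ($samples as $label => $code) {
    // preg_match returns 1 on a match, 0 on no match, false on error.
    $results[$label] = preg_match($pattern, $code) === 1;
}

// Expected: open => true, closed => false, closed_ws => false
var_dump($results);
```

Because the lookahead is evaluated after every consumed character, this form does noticeably more work on long inputs than a single lookahead placed right after <?php.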

Update: For efficiency, this PCRE regex should perform faster than the previous one:

'/^\s*<\?php(?>.*\?>\s*$(*SKIP)(*F)|.*+$)/s'

  • (*FAIL) behaves like a failing negative assertion and is a synonym for (?!)
  • (*SKIP) defines a point beyond which the regex engine is not allowed to backtrack when the subpattern fails later
  • Together, (*SKIP)(*FAIL) provide a nice way around the restriction that you cannot use a variable-length lookbehind, which the above regex would otherwise need.
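
As a rough illustration, the SKIP-FAIL variant can be checked the same way in PHP. The helper name lacksClosingTag and the sample strings are assumptions made for this sketch, not part of the answer.

```php
<?php
// Hypothetical helper wrapping the SKIP-FAIL pattern from the answer.
function lacksClosingTag(string $code): bool
{
    // First alternative: if the rest of the string ends with the closing
    // tag plus optional whitespace, (*SKIP)(*F) forces the match to fail.
    // Second alternative: otherwise consume everything possessively.
    $pattern = '/^\s*<\?php(?>.*\?>\s*$(*SKIP)(*F)|.*+$)/s';
    return preg_match($pattern, $code) === 1;
}

// No closing tag at the end, so the second alternative matches.
var_dump(lacksClosingTag("<?php\necho 'hi';\n"));        // bool(true)
// Ends with the closing tag, so the first alternative rejects it.
var_dump(lacksClosingTag("<?php\necho 'hi';\n?>"));      // bool(false)
// Closing tag followed by trailing whitespace is rejected as well.
var_dump(lacksClosingTag("<?php\necho 'hi';\n?>  \n"));  // bool(false)
```

A closing tag in the middle of the string is not rejected, since only the very end of the input is tested.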
