
I want to use preg_replace on a string, but although the string does not match the pattern, I get an empty string as the return value.

PHP Code:
    $sql = "k1 LIKE 'n' OR k2 LIKE 'n' OR k3 LIKE 'n' OR k4 LIKE 'n' OR k5 LIKE 'n' OR k6 LIKE '1' ";
    print "SQL: $sql<br>";
    $sql_A = preg_replace("/^([\w]+ LIKE\s?'?.*?'? OR )+ $/", "##$1##", $sql);
    print "=> $sql_A<br>";

returns:

SQL: k1 LIKE 'n' OR k2 LIKE 'n' OR k3 LIKE 'n' OR k4 LIKE 'n' OR k5 LIKE 'n' OR k6 LIKE '1'
=>

The weird thing is, the regex doesn't even match. I simplified the string as much as I could while still getting the same result. I can also add an OR just before the end anchor $. But as soon as I remove some more or add something else, I get the normal behavior and the right replacement.

Does anyone see why this happens? Maybe I made a mistake I'm not seeing, maybe this is a bug? I'm lost...

(Using PHP version 5.3.21 on a Linux server)

--- 2013-04-24 21:30 ---

Edit:

If it helps, my originally used data was:

$sql = "SELECT count(*) AS anz FROM xxx_table LEFT JOIN yyy_table on yyy_table.devo=xxx_table.devo LEFT JOIN zzz_table ON (yyy_table.status=zzz_table.sid AND lang=1 AND yyy_table.for =zzz_table.for) LEFT JOIN kunde k on k.kd_nr = yyy_table.kd_nr LEFT JOIN locations l on l.lok_nr = yyy_table.lok_nr WHERE xxx_table.clear!='Y' AND xxx_table.ign!='Y' AND name NOT LIKE 'Container:%' AND name LIKE '%' AND event_prio LIKE '%' AND zzz_table.show='Y' AND ( k.ikd_nr LIKE '%n%' OR k.such LIKE '%n%' OR k.adr1 LIKE '%n%' OR k.adr2 LIKE '%n%' OR k.ort LIKE '%n%' OR k.strasse LIKE '%n%' )";
$sql_A = preg_replace("/(\((\s*[\w\.\-\`]+ (LIKE\s|=)\s*'?.+?'?\s*OR)* [\w\.\-\`]+ LIKE '%' (OR [\w\.\-\`]+ (LIKE\s|=)\s*'?.+?'?\s*)*\))/i", "", $sql);

And again, the result was NULL (not an empty string, as I have learned ;). (The purpose of this preg_replace is to identify OR-conditions in SQL statements that contain LIKE '%' patterns, in order to remove those unnecessary OR-conditions.)

  • 1
    If matches are found, the new subject will be returned, otherwise subject will be returned unchanged or NULL if an error occurred. (php.net/manual/en/function.preg-replace.php) ... so there is probably an error with the pattern. The main one might be [\w] which should be \w in PHP. There are a few more issues with the pattern though. What is the desired output? Commented Apr 24, 2013 at 18:34
  • With the anchors, doesn't this expression require the string to end with OR ("OR" plus two spaces)? Commented Apr 24, 2013 at 18:44
  • 1
    @Wiseguy it does... if it was working it would also replace the entire string with the last capture. That's what I meant with "there are a few more issues"... but without the desired output, trying to fix it is just wild guessing. Commented Apr 24, 2013 at 18:46
  • 2
    @m.buettner Agreed. I wasn't addressing you specifically, just pointing out my first observation about the pattern. To address you specifically, I don't think [\w] would be a problem. The [ ] is unnecessary, but I believe it's still functionally equivalent to simply \w. Commented Apr 24, 2013 at 18:52
  • 1
    My desired output is the original $sql string. Because there is no match, I don't want any changes. Commented Apr 24, 2013 at 19:17

2 Answers

4

Your regex is built in a way that requires an enormous amount of backtracking:

 echo preg_last_error();

Yields 2 (PREG_BACKTRACK_LIMIT_ERROR)

You could of course increase the backtrack limit:

 ini_set("pcre.backtrack_limit",100000000);

... which would not make this regex work (it still doesn't match), but preg_replace would at least return the original string instead of NULL. Creating a more efficient regex seems more attractive, but to make a stab at that I would need a desired input & output.
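
A minimal sketch of how the NULL return could be caught instead of being printed as an empty string. The function name `checked_preg_replace` is hypothetical (it is not part of PHP); only `preg_replace()` and `preg_last_error()` are real PHP functions.

```php
<?php
// Hypothetical helper (not part of PHP): like preg_replace(), but a PCRE
// failure raises an exception instead of silently returning NULL.
function checked_preg_replace($pattern, $replacement, $subject)
{
    $result = preg_replace($pattern, $replacement, $subject);
    if ($result === null) {
        // preg_last_error() returns one of the PREG_*_ERROR constants,
        // e.g. PREG_BACKTRACK_LIMIT_ERROR (2).
        throw new RuntimeException(
            'preg_replace failed, preg_last_error() = ' . preg_last_error()
        );
    }
    return $result;
}

// A subject that simply does not match comes back unchanged.
echo checked_preg_replace('/z/', 'x', 'abc'), "\n";
```

Comparing with `=== null` matters here: printing NULL produces an empty string, which is why the failure looked like an empty replacement result.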

edit:

Looking at it some more, I think this regex might help you on your way a little:

$sql_A = preg_replace(
   "/(\w+ LIKE\s*('(\\\.|[^'\\\]|)*'|[^\s]+))/",
   "##<$1>##",
   $sql);

10 Comments

Oh. I never thought about backtracks... I think I have to read a bit about it. I don't even know what a backtrack is or why it is produced...
Backtracking is what the regex engine does internally to verify a match (it's just what makes regexes work). The main problem in your regex is the '?.*?'?, which hardly limits the possible matches, so the engine has to check a lot of permutations before arriving at the correct one (or at none, as is the case now).
Thanks a lot! I got the idea now. So I'll try to use Wrikken's (x|y) suggestion in future instead of stuff like '?.*?'?
@Wrikken Just to clarify, you mean my problem is/was that there were too many backtracks? I tried the (x|y) for the quotes now and I get an output. But as I mentioned before, when I only changed some other small parts of the regex, I got an output before as well. (But it would seem logical to me now that the backtracks caused the NULL return.)
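
To illustrate the point about '?.*?'? made in these comments, here is a rough sketch that runs the pattern from the question next to a tighter variant on the same input. The tighter pattern is an assumption made for illustration (it requires quoted values and is not the exact regex from the answer). Whether the loose pattern actually returns NULL depends on the PHP version and on pcre.backtrack_limit, so the script only reports what happened.

```php
<?php
$sql = "k1 LIKE 'n' OR k2 LIKE 'n' OR k3 LIKE 'n' OR k4 LIKE 'n' OR k5 LIKE 'n' OR k6 LIKE '1' ";

$patterns = array(
    // Shape from the question: every piece of '?.*?'? is optional or lazy, so
    // on failure the engine retries many ways of splitting the string.
    'loose' => "/^(\w+ LIKE\s?'?.*?'? OR )+ $/",
    // Illustrative tighter shape: a quote, non-quote characters, a quote.
    'tight' => "/^(\w+ LIKE\s?'[^']*' OR )+ $/",
);

$results = array();
foreach ($patterns as $name => $pattern) {
    $results[$name] = preg_replace($pattern, '##$1##', $sql);
    if ($results[$name] === null) {
        echo "$name: PCRE error code ", preg_last_error(), "\n";
    } else {
        echo "$name: ", $results[$name], "\n";
    }
}
```

Neither pattern matches this input, since the string does not end in "OR" plus two spaces. The tight variant leaves the engine almost nothing to retry, so it fails quickly and the subject is returned unchanged.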
0

try this:

$sql = "k1 LIKE 'n' OR k2 LIKE 'n' OR k3 LIKE 'n' OR k4 LIKE 'n' OR k5 LIKE 'n' OR k6 LIKE '1' ";
$pattern = "~(?>\w++(?>\.\w++)?\h++LIKE\h++'[^']++'\h++OR\h++)*+\w++(?>\.\w++)?\h++LIKE\h++'[^']++'~";
$result = preg_replace($pattern, "##$0##",$sql);
echo '<pre>' . print_r($result, true) . '</pre>';

1 Comment

This works. But I don't understand parts of the regex. The tilde is used as a delimiter as far as I can see. But is (?> some kind of look-ahead/behind? (I can't find it with search engines, as they just skip those characters :) And what does \w++ do? (Does ++ have a special meaning?)
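
On the question raised in this comment: in PCRE, (?> ... ) is an atomic group rather than a look-ahead or look-behind, and ++ is a possessive quantifier. Both tell the engine not to backtrack into what has already been matched. A small sketch with a made-up subject string (not taken from the answer) shows the difference from an ordinary greedy quantifier:

```php
<?php
$subject = 'abc1';

// Greedy: \w+ first takes "abc1", then gives back "1" so \d can match.
$greedy = preg_match('/\w+\d/', $subject);

// Possessive: \w++ takes "abc1" and never gives anything back, so \d fails.
$possessive = preg_match('/\w++\d/', $subject);

// Atomic group: (?>\w+) behaves the same way as the possessive form here.
$atomic = preg_match('/(?>\w+)\d/', $subject);

var_dump($greedy, $possessive, $atomic); // int(1), int(0), int(0)
```

In the answer's pattern this is used the other way round: the possessive pieces can never match differently on a second attempt, so forbidding backtracking into them saves work without changing which strings match.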
