I'm extracting a string from the Wikipedia API that initially looks like this: link text. I want to strip out every {{...}} block, including everything between the braces (which could be any kind of text). For that I thought about using a recursive function with preg_match and preg_replace, something like:

function drop_brax($text)
{
    if(preg_match('/{{(.)*}}/',$text)) 
    return drop_brax(preg_replace('/{{(.)*}}/','',$text));
    return $text;
}

This function will not work because of a situation like this:

{{ I like mocachino {{ but I also like banana}} and frutis }}

This will strip everything between the first occurrence of {{ and the first occurrence of }} (leaving behind "and frutis }}"). How can I do this properly, while keeping the nice recursive form?

2 Answers

Try something like this:

$text = '...{{aa{{bb}}cc}}...{{aa{{bb{{cc}}bb{{cc}}bb}}dd}}...';
preg_match_all('/\{\{(?:[^{}]|(?R))*}}/', $text, $matches);
print_r($matches);

output:

Array
(
    [0] => Array
        (
            [0] => {{aa{{bb}}cc}}
            [1] => {{aa{{bb{{cc}}bb{{cc}}bb}}dd}}
        )
)

And a short explanation:

\{\{      # match two opening braces
(?:       # start non-capturing group
  [^{}]   #   match any character except '{' and '}'
  |       #   OR
  (?R)    #   recursively match the entire pattern: \{\{(?:[^{}]|(?R))*}}
)         # end non-capturing group
*         # repeat the group zero or more times
}}        # match two closing braces
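
Since the question asks for the blocks to be removed rather than listed, the same recursive pattern can probably be handed to preg_replace instead. This is only a minimal sketch: the function name drop_brax_regex is made up for illustration, and falling back to the original text when preg_replace returns null is an assumption about how you would want errors handled.

```php
<?php
// Hypothetical helper: removes every balanced {{...}} block in one pass.
function drop_brax_regex(string $text): string
{
    // Same recursive pattern as above, with the closing braces escaped for symmetry.
    $result = preg_replace('/\{\{(?:[^{}]|(?R))*\}\}/', '', $text);

    // preg_replace returns null on failure (for example, when a PCRE limit is hit);
    // in that case this sketch simply returns the input unchanged.
    return $result ?? $text;
}

echo drop_brax_regex('...{{aa{{bb}}cc}}...{{aa{{bb{{cc}}bb{{cc}}bb}}dd}}...');
// prints: .........
```

Note that, because of [^{}], a block containing a lone '{' or '}' will not match and is left in place.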

1 Comment

I tried it and so far so good. I'm going to give it a couple more tests. Thank you very much!

To make this fully recursive, you will need a parser:

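// Note: this counts single braces, so a lone '{' or '}' also acts as a delimiter.
function drop_brax($str)
{
    $buffer = '';   // text found outside any braces
    $depth = 0;     // current brace nesting level
    $strlen_str = strlen($str);
    for ($i = 0; $i < $strlen_str; $i++)
    {
        $char = $str[$i];

        switch ($char)
        {
            case '{':
                $depth++;
                break;
            case '}':
                // Ignore stray closing braces so the depth never goes negative.
                if ($depth > 0) {
                    $depth--;
                }
                break;
            default:
                // Keep the character only when we are outside all braces.
                if ($depth === 0) {
                    $buffer .= $char;
                }
        }
    }
    return $buffer;
}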

$str = 'some text {{ I like mocachino {{ but I also like banana}} and frutis }} some text';
$str = drop_brax($str);
echo $str;

output:

some text  some text
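
The parser above reacts to every single brace. If only doubled braces should count as delimiters, a variant along these lines might work. It is a rough sketch, not a tested Wikipedia markup parser, and the name drop_double_brax is hypothetical.

```php
<?php
// Hypothetical variant: only '{{' and '}}' change the nesting depth.
function drop_double_brax(string $str): string
{
    $buffer = '';
    $depth = 0;
    $len = strlen($str);

    for ($i = 0; $i < $len; $i++) {
        // Look at the current character together with the next one.
        $pair = substr($str, $i, 2);

        if ($pair === '{{') {
            $depth++;
            $i++; // skip the second brace of the pair
        } elseif ($pair === '}}' && $depth > 0) {
            $depth--;
            $i++; // skip the second brace of the pair
        } elseif ($depth === 0) {
            // Outside every {{...}} block: keep the character, single braces included.
            $buffer .= $str[$i];
        }
    }

    return $buffer;
}

echo drop_double_brax('a {b} c {{ x {{ y }} z }} d');
// prints: a {b} c  d
```

Single braces outside a block survive, and an unmatched }} at depth zero is kept as ordinary text.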

1 Comment

I tried both your suggestion and Bart K.'s; his was evidently faster. Nevertheless, thanks a lot for your help! I appreciate it.
