Regex accepting recursive curly brackets in body
I know there is a selected answer, but in case the tokenizer cannot be used, here is a simple regex that extracts a function's name, parameters and body from PHP code.
The main difference from Ioseb's answer above is that this regex accepts nested curly brackets in the body, meaning it won't stop at the first closing curly bracket.
/function\s+(?<name>\w+)\s*\((?<param>[^\)]*)\)\s*(?<body>\{(?:[^{}]+|(?&body))*\})/
Explanation
/ # delimiter
function # function keyword
\s+ # at least one whitespace
(?<name>\w+) # function name (a word) => group "name"
\s* # optional whitespace(s)
\((?<param>[^\)]*)\) # function parameters => group "param"
\s* # optional whitespace(s)
(?<body>\{(?:[^{}]+|(?&body))*\}) # function body (nested curly brackets allowed) => group "body"
/ # delimiter
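If you want to keep that breakdown next to the pattern in real code, one option is the x (extended) modifier, which makes PCRE ignore whitespace in the pattern and treat # as the start of a comment. The sketch below is the same regex laid out that way; the ~ delimiter, the nowdoc and the $sample string are my own choices for illustration.

```php
<?php
// Same pattern, spread over several lines thanks to the x modifier.
// "~" replaces "/" as the delimiter (an arbitrary choice for this sketch).
$pattern = <<<'REGEX'
~
    function \s+               # "function" keyword, then at least one whitespace
    (?<name> \w+ ) \s*         # function name => group "name"
    \( (?<param> [^)]* ) \)    # parameter list => group "param"
    \s*
    (?<body>                   # function body => group "body"
        \{
        (?: [^{}]+             #   a run of characters that are not curly brackets
          | (?&body)           #   or a nested block, matched by recursing into "body"
        )*
        \}
    )
~x
REGEX;

// Hypothetical one-line sample, only there to exercise the pattern.
$sample = 'function demo($a){ if($a){ return 1; } }';
preg_match($pattern, $sample, $m);

echo $m['name'], ' | ', $m['param'], ' | ', $m['body'], PHP_EOL;
```

The comments inside the pattern must not contain the delimiter character, which is one reason to prefer ~ over / here.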
Example
$data = '
<?php
function my_function($param){
if($param === true){
// This is true
}else if($param === false){
// This is false
}else{
// This is not
}
}
?>
';
preg_match_all('/function\s+(?<name>\w+)\s*\((?<param>[^\)]*)\)\s*(?<body>\{(?:[^{}]+|(?&body))*\})/', $data, $matches);
print_r($matches['body']);
/*
Array
(
[0] => {
if($param === true){
// This is true
}else if($param === false){
// This is false
}else{
// This is not
}
}
)
*/
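Only the body group is printed above, but the same call also fills the name and param groups. As a minimal sketch (the two sample functions and the $functions array are made up for illustration), PREG_SET_ORDER gives one entry per function, which is convenient when a file contains several:

```php
<?php
$pattern = '/function\s+(?<name>\w+)\s*\((?<param>[^\)]*)\)\s*(?<body>\{(?:[^{}]+|(?&body))*\})/';

// Hypothetical input with two functions, one of them with a nested block.
$code = <<<'SRC'
function add($a, $b) {
    return $a + $b;
}
function is_even($n) {
    if ($n % 2 === 0) { return true; }
    return false;
}
SRC;

// PREG_SET_ORDER groups the result per match instead of per capture group.
preg_match_all($pattern, $code, $matches, PREG_SET_ORDER);

$functions = [];
foreach ($matches as $m) {
    $functions[$m['name']] = [
        'param' => $m['param'],
        'body'  => $m['body'],
    ];
}

print_r(array_keys($functions));
```

Keying the result by function name is just one possible layout; it silently keeps the last definition if the same name appears twice.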
Limitation
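The regex knows nothing about string literals or comments, so every curly bracket in the body counts toward the nesting and the brackets have to be balanced as raw text.
For example, this body will only be partially extracted: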
function my_function(){
echo "A curly bracket : }";
echo "Another curly bracket : {";
}
/*
Array
(
[0] => {
echo "A curly bracket : }
)
*/
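If plain quoted strings are the only problem in your input, the pattern can be extended to step over them. The following is a rough sketch and not part of the regex above: it adds two alternatives for single-quoted and double-quoted strings, written with the x modifier for readability and the s modifier so an escaped newline inside a string still matches. It assumes there are no heredocs and no comments containing quotes or curly brackets, which it still gets wrong.

```php
<?php
// Variant of the pattern above. The two quoted-string alternatives are an
// addition for illustration; heredocs and comments are NOT handled.
$pattern = <<<'REGEX'
~
    function \s+ (?<name> \w+ ) \s*
    \( (?<param> [^)]* ) \) \s*
    (?<body>
        \{
        (?: [^{}'"]++                 # anything but curly brackets and quotes
          | ' (?: [^'\\] | \\. )* '   # single-quoted string, escapes allowed
          | " (?: [^"\\] | \\. )* "   # double-quoted string, escapes allowed
          | (?&body)                  # nested block
        )*
        \}
    )
~xs
REGEX;

// The function from the limitation example.
$code = <<<'SRC'
function my_function(){
    echo "A curly bracket : }";
    echo "Another curly bracket : {";
}
SRC;

preg_match($pattern, $code, $m);
print_r($m['body']);
```

With this variant the whole body is captured, because each quoted string is consumed as a unit before its brackets can be counted.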
$s = 'foo function() bar'; and /* no function() */ and think about the fact that stuff like /* can also be placed inside string literals (and vice versa). In short: don't do this using regex (see stereofrog's answer).
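For comparison, when the tokenizer extension is available, the same extraction can be sketched with token_get_all(). This is my own rough illustration, not the code from the other answer: extract_function_bodies is a hypothetical helper, it only handles named functions, and the input must contain the <?php opening tag or everything is treated as inline HTML.

```php
<?php
// Hypothetical helper: maps each named function in $code to the source text
// of its body, curly brackets included.
function extract_function_bodies(string $code): array
{
    $tokens = token_get_all($code);
    $count  = count($tokens);
    $bodies = [];

    for ($i = 0; $i < $count; $i++) {
        if (!is_array($tokens[$i]) || $tokens[$i][0] !== T_FUNCTION) {
            continue;
        }

        // A T_STRING before "(" is the name; closures have none.
        $name = null;
        for ($i++; $i < $count && $tokens[$i] !== '(' && $tokens[$i] !== ';'; $i++) {
            if (is_array($tokens[$i]) && $tokens[$i][0] === T_STRING) {
                $name = $tokens[$i][1];
            }
        }

        // Skip ahead to the opening bracket; ";" means there is no body.
        while ($i < $count && $tokens[$i] !== '{' && $tokens[$i] !== ';') {
            $i++;
        }
        if ($name === null || $i >= $count || $tokens[$i] === ';') {
            continue;
        }

        // Rebuild the body while tracking depth. Brackets inside string
        // literals and comments belong to larger tokens, so they never
        // appear here as a bare "{" or "}".
        $depth = 0;
        $body  = '';
        for (; $i < $count; $i++) {
            $token = $tokens[$i];
            $body .= is_array($token) ? $token[1] : $token;

            // "{$var}" and "${var}" open with array tokens but close with "}".
            $opensInterpolation = is_array($token)
                && in_array($token[0], [T_CURLY_OPEN, T_DOLLAR_OPEN_CURLY_BRACES], true);

            if ($token === '{' || $opensInterpolation) {
                $depth++;
            } elseif ($token === '}' && --$depth === 0) {
                break;
            }
        }
        $bodies[$name] = $body;
    }

    return $bodies;
}

$code = <<<'SRC'
<?php
function my_function(){
    echo "A curly bracket : }";
    echo "Another curly bracket : {";
}
SRC;

print_r(extract_function_bodies($code));
```

Functions nested inside an extracted body are not reported separately in this sketch, since the scan resumes after the closing bracket.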