5

I'm trying to write a PHP regex to extract functions from PHP source code. So far I have used a recursive regex to extract everything between {}, but that also matches things like if statements. When I use something like:

preg_match_all("/(function .*\(.*\))({([^{}]+|(?R))*})/", $data, $matches);

It doesn't work when there is more than one function in the file (probably because the recursion also includes the 'function' part).

Is there any way to do this?

Example file:

<?php
if($useless)
{
  echo "i don't want this";
}

function bla($wut)
{
  echo "i do want this";
}
?>

Thanks

2
  • Consider this: $s = 'foo function() bar'; and /* no function() */ and think about the fact that stuff like /* can also be placed inside string literals (and vice versa). In short: don't do this using regex (see stereofrog's answer). Commented Mar 21, 2010 at 20:12
  • As an alternative to regular expressions (which can never correctly handle all edge cases), this answer explains how to use a PHP parser written in PHP to extract a function from a piece of code: stackoverflow.com/a/66907961/101087 Commented Apr 1, 2021 at 16:49

3 Answers

6

Regular expressions are the wrong tool for this. Consider the tokenizer or reflection instead.
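
To illustrate the tokenizer route, here is a minimal sketch built on PHP's token_get_all(). The helper name extract_functions() and the choice to key the result by function name are assumptions made for this example, not something from the answer itself. It skips closures and declarations without a body, and functions or methods that share a name overwrite one another.

```php
<?php
// Sketch only: extract_functions() is a hypothetical helper, not a PHP built-in.
function extract_functions(string $source): array
{
    $tokens = token_get_all($source);
    $count = count($tokens);
    $functions = [];

    for ($i = 0; $i < $count; $i++) {
        // Only the T_FUNCTION keyword token starts a declaration.
        if (!is_array($tokens[$i]) || $tokens[$i][0] !== T_FUNCTION) {
            continue;
        }

        // The name is the first T_STRING before "("; closures have none.
        $name = null;
        for ($j = $i + 1; $j < $count && $tokens[$j] !== '('; $j++) {
            if (is_array($tokens[$j]) && $tokens[$j][0] === T_STRING) {
                $name = $tokens[$j][1];
                break;
            }
        }
        if ($name === null) {
            continue;
        }

        // Rebuild the source text up to the brace that closes the body.
        $code = '';
        $depth = 0;
        for ($k = $i; $k < $count; $k++) {
            $token = $tokens[$k];
            $code .= is_array($token) ? $token[1] : $token;

            // "{$" and "${" inside strings are closed by a plain "}", so count them too.
            $opensBrace = $token === '{' || (is_array($token)
                && in_array($token[0], [T_CURLY_OPEN, T_DOLLAR_OPEN_CURLY_BRACES], true));

            if ($opensBrace) {
                $depth++;
            } elseif ($token === '}') {
                if (--$depth === 0) {
                    $functions[$name] = $code;
                    break;
                }
            } elseif ($token === ';' && $depth === 0) {
                break; // declaration without a body (abstract or interface method)
            }
        }
    }

    return $functions;
}

// The example file from the question.
$source = <<<'PHP'
<?php
if($useless)
{
  echo "i don't want this";
}

function bla($wut)
{
  echo "i do want this";
}
?>
PHP;

$functions = extract_functions($source);
print_r(array_keys($functions));
```

Because braces inside string literals and comments arrive as part of larger tokens, they are never counted, which is the case a plain regex gets wrong.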

5

Moved here from a duplicate question: "PHP, Regex and new lines".

Regex solution:

$regex = '~
  function                 #function keyword
  \s+                      #one or more whitespace characters
  (?P<function_name>.*?)   #function name itself
  \s*                      #optional white spaces
  (?P<parameters>\(.*?\))  #function parameters
  \s*                      #optional white spaces
  (?P<body>\{.*?\})        #body of a function
~six';

if (preg_match_all($regex, $input, $matches)) {
  print_r($matches);
}

P.S. As suggested above, the tokenizer is the preferable way to go.
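
One caveat worth seeing in action: the lazy \{.*?\} body stops at the first closing brace, so a function containing a nested block is cut short. A minimal sketch, using the same pattern collapsed onto one line and a made-up function check() as input:

```php
<?php
// Same pattern as above, written on one line (the x flag ignores the missing layout).
$regex = '~function\s+(?P<function_name>.*?)\s*(?P<parameters>\(.*?\))\s*(?P<body>\{.*?\})~six';

// Hypothetical input with a nested block inside the function body.
$input = 'function check($flag) { if ($flag) { return 1; } return 0; }';

preg_match($regex, $input, $m);
echo $m['function_name'], "\n"; // check
echo $m['parameters'], "\n";    // ($flag)
echo $m['body'], "\n";          // { if ($flag) { return 1; }
```

The captured body ends at the brace that closes the if block, not the one that closes the function.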

0

Regex accepting recursive curly brackets in body

I know there is a selected answer, but in case the tokenizer cannot be used, this is a simple regex to extract a function (name, parameters and body) from PHP code.

The main difference from Ioseb's answer above is that this regex accepts nested curly brackets in the body, meaning it won't stop after the first closing curly bracket.

/function\s+(?<name>\w+)\s*\((?<param>[^\)]*)\)\s*(?<body>\{(?:[^{}]+|(?&body))*\})/

Explanation

/                                   # delimiter
function                            # function keyword
\s+                                 # at least one whitespace
(?<name>\w+)                        # function name (a word) => group "name"
\s*                                 # optional whitespace(s)
\((?<param>[^\)]*)\)                # function parameters => group "param"
\s*                                 # optional whitespace(s)
(?<body>\{(?:[^{}]+|(?&body))*\})   # function body (nested curly brackets allowed) => group "body"
/                                   # delimiter

Example

$data = '
    <?php 
    function my_function($param){
        if($param === true){
            // This is true
        }else if($param === false){
            // This is false
        }else{
            // This is not
        }
    }
    ?>
';

preg_match_all("/function\s+(?<name>\w+)\s*\((?<param>[^\)]*)\)\s*(?<body>\{(?:[^{}]+|(?&body))*\})/", $data, $matches);
print_r($matches['body']);

/*
Array
(
    [0] => {
        if($param === true){
            // This is true
        }else if($param === false){
            // This is false
        }else{
            // This is not
        }
    }
)
*/

Limitation

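Curly brackets have to be balanced, including those inside string literals, because the regex does not know what a string is. For example, this body will only be partially extracted: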

function my_function(){
    echo "A curly bracket : }";
    echo "Another curly bracket : {";
}

/*
Array
(
    [0] => {
    echo "A curly bracket : }
)
*/
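
One possible way to soften this limitation is to let the body skip over quoted strings as single units. The sketch below is a variation on the regex above, not a complete fix: the string alternatives, the possessive ++ and the x and s flags are additions made for this example, and braces or quotes inside comments and heredocs will still break it.

```php
<?php
// Variation of the regex above that treats quoted strings as opaque units.
$pattern = <<<'REGEX'
~
function \s+ (?<name>\w+) \s*
\( (?<param>[^)]*) \) \s*
(?<body>
  \{
  (?:
      [^{}'"]++                 # anything except braces and quotes
    | ' (?: [^'\\] | \\. )* '   # single-quoted string, escapes allowed
    | " (?: [^"\\] | \\. )* "   # double-quoted string, escapes allowed
    | (?&body)                  # nested block
  )*
  \}
)
~xs
REGEX;

// The problematic function from the limitation example.
$code = <<<'PHP'
function my_function(){
    echo "A curly bracket : }";
    echo "Another curly bracket : {";
}
PHP;

preg_match($pattern, $code, $m);
print_r($m['body']);
```

With the strings consumed whole, the body group here runs to the brace that really closes the function.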
