
I'm currently working on a small framework-type project. In this project an eval() is needed. The eval string is not user-submitted, but I would still like to validate that the string is (or contains) a variable.

The variables could be normal variables, class properties, or superglobal variables. I'm new to regex, so I would appreciate any help.

Just to clarify: as an example, the string would contain something like '$_GET["something"]'.

  • The strict definition of regexes is definitely not capable of validating whether a string is valid PHP, JavaScript, etc. PHP has some extensions, so it is likely possible there. But programming languages are validated/parsed using a context-free grammar; like this one for PHP. Commented Jun 1, 2015 at 19:09
  • Check out PHP_Parser in PEAR... pear.php.net/package/PHP_Parser Commented Jun 1, 2015 at 19:11
  • Note that $_GET["something"] is not "eval-safe", for example, $_GET["${`format c:`}"]. Commented Jun 1, 2015 at 19:38
  • "In this project a eval() is needed." – I'd take some time to seriously reconsider its necessity. Is eval() really (really, really, really) needed? Sharing more about the problem being "solved" by eval() might lead to more suitable alternatives. Commented Jun 1, 2015 at 19:46
  • Indeed, mind all kinds of side effects; eval is one of the easy ways to hack into a web server. Of course one needs to find a way to inject into eval, but in many cases that's doable. A lot of scripting languages seem to regret eval. It is easy to call the interpreter, but it opens Pandora's box. Commented Jun 1, 2015 at 19:49

2 Answers


You can use the following:

(\$[a-zA-Z_]\w*(\[(["'])\w+\3\])?|\$\{\w+\})

See DEMO

Note: It is better to use a library like this and this than a regex solution. (From the discussions)
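As a rough sketch of how the pattern above might be applied in PHP: the function name `looksLikeSimpleVariable` is made up for illustration, and the `\A ... \z` anchors are an addition (not part of the answer's pattern) so that the whole string has to match rather than any substring.

```php
<?php
// Hypothetical helper, not part of the original answer.
// The answer's pattern is wrapped in \A ... \z (an added assumption)
// so that the entire string must be a variable, not just contain one.
function looksLikeSimpleVariable(string $code): bool
{
    $pattern = '/\A(\$[a-zA-Z_]\w*(\[(["\'])\w+\3\])?|\$\{\w+\})\z/';
    return preg_match($pattern, $code) === 1;
}

var_dump(looksLikeSimpleVariable('$_GET["something"]'));          // bool(true)
var_dump(looksLikeSimpleVariable('system("rm -rf /"); // $foo')); // bool(false)
```

This only covers plain variables with an optional single string key; class properties such as `$obj->prop` would need further alternatives.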


8 Comments

I assume the something is a variable name, where variable names can't contain ] for instance?
system("rm -rf /"); // $foo also matches this regex (demo) — you might want some anchors
Extended ASCII characters such as 228 (ä) may be accepted. Something like this: $täyte See: PHP variables
That's why it is in general a bad idea to do programming language processing yourself. Most of the concepts look easy, but in fact such languages are very complicated, with all kinds of exceptions. Therefore one is better off using a library.
@emartinelli $☁❄☃☀☂ is also perfectly valid.
|
\$[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*(?:\[["']\w+["']\])?

DEMO

P.S.: (Extended) ASCII 228 (ä) is accepted in PHP variable names.

Reference: php variables
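A minimal sketch of this pattern in use, assuming the source file is saved as UTF-8 and that the whole string should be a single variable. The helper name `isVariableName` and the `\A ... \z` anchors are assumptions added for illustration.

```php
<?php
// Hypothetical helper: anchored variant of the pattern above.
// Without the /u modifier, PCRE matches byte by byte, so the range
// \x7f-\xff accepts every byte of a multibyte UTF-8 character.
function isVariableName(string $code): bool
{
    $pattern = '/\A\$[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*(?:\[["\']\w+["\']\])?\z/';
    return preg_match($pattern, $code) === 1;
}

var_dump(isVariableName('$täyte'));             // bool(true)
var_dump(isVariableName('$_GET["something"]')); // bool(true)
var_dump(isVariableName('$1abc'));              // bool(false)
```

One thing to keep in mind: as written, the pattern does not require the closing quote to be the same character as the opening one, so `$_GET["something']` would also pass.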

2 Comments

I'm not sure about the \D. This will match a new line or space character as well. The funny thing is that the regex is in the reference you point at: \$[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*. But now the challenge remains how you are going to filter on comments, strings, etc.
I edited. But filtering out comments, strings, etc. will be a problem.
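On the comments-and-strings problem, one possible alternative to a regex is PHP's built-in tokenizer, which already tells variables apart from comments and string literals. This is a different technique from the libraries linked in the thread. The sketch below uses `token_get_all()`; the helper name `isPlainVariableExpression` and the deliberately narrow set of accepted shapes are assumptions for illustration.

```php
<?php
// Hypothetical helper: accepts only `$name`, `$name['key']` or `$name["key"]`.
function isPlainVariableExpression(string $code): bool
{
    // token_get_all() needs an opening tag to treat the input as PHP code.
    $tokens = token_get_all('<?php ' . $code);
    array_shift($tokens); // drop the T_OPEN_TAG token

    // Reduce each token to its type: an int constant for named tokens,
    // the character itself for single-character tokens such as "[".
    $types = array_map(static function ($token) {
        return is_array($token) ? $token[0] : $token;
    }, $tokens);

    return $types === [T_VARIABLE]
        || $types === [T_VARIABLE, '[', T_CONSTANT_ENCAPSED_STRING, ']'];
}

var_dump(isPlainVariableExpression('$_GET["something"]'));      // bool(true)
var_dump(isPlainVariableExpression('$_GET["${`format c:`}"]')); // bool(false)
var_dump(isPlainVariableExpression('// $foo'));                 // bool(false)
```

A string key that contains interpolation is not a `T_CONSTANT_ENCAPSED_STRING`, which is why the backtick example is rejected. Class properties and nested keys are not handled here and would need extra token sequences.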
