I would like to define a function make_function that returns a new function. It takes as arguments a list arg_names of argument names for the new function and a function inner_func to be used in the definition of the new function. The new function will just add 5 to the output of inner_func in this simple case.

I tried using eval():

from typing import Callable, List

def make_function(args_names: List[str], inner_func: Callable):
    args_str = ", ".join(args_names)
    expr: str = (
        "lambda "
        + args_str
        + ": "
        + inner_func.__name__
        + "("
        + args_str
        + ") + 5"
    )

    return eval(expr)

This works, but using eval is not recommended. For one thing, it is not easy to debug. Also, in my use case I need to call the new function from a place where inner_func is not available, which raises an error. Are there any other options?

  • It's possible, but it's really difficult. The problem is trivial if you use *args, so, why not just use *args? Does it really matter what the argument names are, if the only thing you will use them for is to pass them all, in the same order, into the inner_func? Commented Nov 3, 2022 at 1:09
  • There is also the operator module that can compose operations from constituents. See docs.python.org/3/library/operator.html#operator.add for example. And also ast.literal_eval. Not sure if it fits your use case. eval fits blackhats' use cases perfectly well however: it is a massive security hole. Commented Nov 3, 2022 at 3:36
  • @wim Unfortunately the exact names of the arguments do matter: there is a function of a package I'm using which extracts the names of the arguments and builds a graph with those names as nodes (the details don't matter here). Commented Nov 3, 2022 at 9:42

1 Answer

Perhaps exec is actually the best approach. eval and exec are usually discouraged when the input string is arbitrary or untrusted. However, when the code string eventually passed to the compiler was generated by you, and not supplied directly by the user, it can be fine. There are stdlib examples of this approach: namedtuple uses eval and dataclass uses exec, and nobody has figured out how to exploit them (yet).

Now, I think that in your case it's fairly easy to do a safe code generation + exec simply by verifying that the args_names passed in is truly a list of arg names, and not some arbitrary Python code.

from textwrap import dedent
from typing import List, Callable

def make_function(args_names: List[str], inner_func: Callable):

    for arg_name in args_names:
        if not arg_name.isidentifier():
            raise Exception(f"Invalid arg name: {arg_name}")

    args = ', '.join(args_names)

    code_str = dedent(f"""\
        def outer_func({args}):
            return inner_func({args}) + 5
    """)

    scope = {"inner_func": inner_func}
    exec(code_str, scope)
    return scope["outer_func"]

Demo:

>>> def orig(a, b):
...     return a + b + 1
... 
>>> func = make_function(args_names=["foo", "bar"], inner_func=orig)
>>> func(2, 3)
11
>>> func(foo=2, bar=3)
11
>>> func(foo=2, bar=3, baz=4)
TypeError: outer_func() got an unexpected keyword argument 'baz'
>>> func(foo=2)
TypeError: outer_func() missing 1 required positional argument: 'bar'

As desired, it continues to work even when a local reference to the inner_func is no longer available, since we made sure the reference was available during code gen:

>>> del orig
>>> func(foo=2, bar=3)
11
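
The reason this works can be checked directly. The sketch below inlines the same exec pattern so that it runs on its own, then inspects the result: the dict passed to exec becomes the generated function's global namespace, and the parameter names are real ones that introspection can see. Whether the graph-building package mentioned in the question reads names this way is an assumption.

```python
import inspect


def orig(a, b):
    return a + b + 1


# Same pattern as make_function above, inlined so the snippet is self-contained.
scope = {"inner_func": orig}
exec("def outer_func(foo, bar):\n    return inner_func(foo, bar) + 5\n", scope)
func = scope["outer_func"]

# The dict handed to exec is the generated function's global namespace,
# so the function holds its own reference to the original callable.
assert func.__globals__ is scope
assert func.__globals__["inner_func"] is orig

# The parameter names are genuine, so signature-based tools can read them.
print(inspect.signature(func))  # (foo, bar)

# Removing the module-level name does not affect the generated function.
del orig
print(func(foo=2, bar=3))  # 11
```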

Nefarious "argument names" are not allowed:

>>> make_function(["foo", "bar", "__import__('os')"], len)
Exception: Invalid arg name: __import__('os')
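
One gap worth noting: str.isidentifier() also returns True for reserved words such as "class", and it does not catch duplicate names or a name that collides with inner_func in the generated code. As far as I can tell none of these is an injection risk. The first two surface as a SyntaxError from exec, and the third silently shadows inner_func. A stricter check might look like the sketch below, where validate_arg_names is a hypothetical helper and not part of the code above.

```python
import keyword
from typing import List


def validate_arg_names(arg_names: List[str], reserved=("inner_func",)) -> None:
    """Hypothetical helper: reject names that cannot safely be used as parameters."""
    seen = set()
    for name in arg_names:
        # Must be a plain identifier, not arbitrary Python code.
        if not isinstance(name, str) or not name.isidentifier():
            raise ValueError(f"Invalid arg name: {name!r}")
        # isidentifier() accepts keywords like "class"; reject them explicitly.
        if keyword.iskeyword(name):
            raise ValueError(f"Arg name is a reserved keyword: {name!r}")
        # A parameter called inner_func would shadow the wrapped callable.
        if name in reserved:
            raise ValueError(f"Arg name collides with generated code: {name!r}")
        if name in seen:
            raise ValueError(f"Duplicate arg name: {name!r}")
        seen.add(name)


validate_arg_names(["foo", "bar"])  # passes silently
```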

For an approach without code generation, it is also possible to instantiate types.FunctionType directly. To do this you need to pass it a types.CodeType instance, and those are pretty difficult to create manually. Both types are public and documented, but the docstring for the code type even tries to scare you away:

>>> (lambda: None).__code__.__doc__
'Create a code object.  Not for the faint of heart.'

If you want to attempt it regardless, see "How to create a code object in python?", but I think you'll find that using eval, exec or compile is more convenient.
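
A different route that avoids both code generation and hand-built code objects is to keep a generic *args/**kwargs wrapper and attach an explicit inspect.Signature to it. This is not the types.FunctionType technique described above, only a sketch of an alternative. It helps only under the assumption that the consuming package reads argument names through inspect.signature; a tool that looks at __code__.co_varnames directly would still see the generic wrapper. The name make_function_with_signature is made up for illustration.

```python
import inspect
from typing import Callable, List


def make_function_with_signature(arg_names: List[str], inner_func: Callable) -> Callable:
    # inspect.Parameter itself raises ValueError for names that are not valid parameters.
    params = [
        inspect.Parameter(name, inspect.Parameter.POSITIONAL_OR_KEYWORD)
        for name in arg_names
    ]
    signature = inspect.Signature(params)

    def outer_func(*args, **kwargs):
        # bind() enforces the declared names and raises TypeError on a mismatch.
        bound = signature.bind(*args, **kwargs)
        return inner_func(*bound.args, **bound.kwargs) + 5

    # inspect.signature() honours this attribute when it is present.
    outer_func.__signature__ = signature
    return outer_func


func = make_function_with_signature(["foo", "bar"], lambda a, b: a + b + 1)
print(inspect.signature(func))  # (foo, bar)
print(func(foo=2, bar=3))       # 11
```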


3 Comments

Thanks for a great, complete answer. One question: I find it surprising that it still works after you delete orig. If I do the same with my code it doesn't work, as I said in the question statement. Does this have to do with eval vs exec or something else?
The difference is that you use inner_func.__name__ in the lambda, so you're relying on the name binding. That name doesn't necessarily hang around. I'm using scope = {"inner_func": inner_func} so I pass the inner_func function instance itself. You can actually fix up the lambda approach by passing the second argument when you call eval, so that you are in control of the name lookup, but I think that using exec is cleaner and easier to understand here.
Note that using inner_func.__name__ also creates another unnecessary restriction: you can't pass a lambda as the inner_func callable. A lambda is an "anonymous" function whose __name__ is "<lambda>", which is not a valid identifier, so you will get a syntax error in the eval. Instead, use something like eval(..., {"_inner_func": inner_func}) to control the name lookup in your version.
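
Put together, the eval variant suggested in these comments might look like the sketch below. The function is named make_function_eval here only to distinguish it from the exec version, and the identifier check is carried over from the answer.

```python
from typing import Callable, List


def make_function_eval(args_names: List[str], inner_func: Callable) -> Callable:
    # Same guard as the exec version: only plain identifiers are accepted.
    for arg_name in args_names:
        if not arg_name.isidentifier():
            raise ValueError(f"Invalid arg name: {arg_name}")

    args_str = ", ".join(args_names)
    # Refer to a fixed name that we control instead of inner_func.__name__.
    expr = f"lambda {args_str}: _inner_func({args_str}) + 5"
    # The second argument is the globals dict used for name lookup in the lambda.
    return eval(expr, {"_inner_func": inner_func})


# A lambda now works as inner_func, because its __name__ is never used.
func = make_function_eval(["foo", "bar"], lambda a, b: a + b + 1)
print(func(foo=2, bar=3))  # 11
```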
