I have a C function whose signature looks like this:

typedef double (*func_t)(double*, int);
int some_f(func_t myFunc);

I would like to pass a Python function (not necessarily explicitly) as the argument to some_f. Unfortunately, I cannot alter the declaration of some_f; that is, I must not change the C code.

One obvious thing I tried was to create a basic wrapper function like this:

cdef double wrapping_f(double *d, int i):  # ideally with an extra parameter: object f
     # do stuff
     return <double>f(d_t)

However, I can't come up with a way to actually "put" the Python function f inside wrapping_f's body.

There is a very bad solution to this problem: I could use a global object variable. However, this forces me to copy and paste multiple instances of essentially the same wrapper function, each using a different global variable (I am planning to use multiple Python functions simultaneously).

  • Have a look at the second half of this answer: stackoverflow.com/a/34900829/4657412. The difficulty is that you want to have some state attached to your function pointer (so it knows which Python object to call), and C function pointers cannot store state. Therefore, I believe it's genuinely impossible in standard C. The only way I've found to do it is to use ctypes or cffi, which manage it using some (hidden) non-standard hacks (generating code at runtime). Commented Jun 26, 2018 at 16:45

2 Answers

I keep my other answer for historical reasons: it shows that there is no way to do what you want without jit-compilation, and it helped me understand how great @DavidW's advice in this answer was.

For the sake of simplicity, I use a slightly simpler function signature and trust you to adapt it to your needs.

Here is a blueprint for a closure, which lets ctypes do the jit-compilation behind the scenes:

%%cython
#needs Cython >= 0.28 to run because of verbatim C-code
cdef extern from *:   #fill some_f with life
    """
    typedef int (*func_t)(int);
    static int some_f(func_t fun){
        return fun(42);
    }
    """
    ctypedef int (*func_t)(int)
    int some_f(func_t myFunc)

#works with any recent Cython version:
import ctypes
cdef class Closure:
    cdef object python_fun
    cdef object jitted_wrapper

    def inner_fun(self, int arg):
        return self.python_fun(arg)

    def __cinit__(self, python_fun):
        self.python_fun=python_fun
        ftype = ctypes.CFUNCTYPE(ctypes.c_int,ctypes.c_int) #define signature
        self.jitted_wrapper=ftype(self.inner_fun)           #jit the wrapper

    cdef func_t get_fun_ptr(self):
        return (<func_t *><size_t>ctypes.addressof(self.jitted_wrapper))[0]

def use_closure(Closure closure):
    print(some_f(closure.get_fun_ptr()))

And now using it:

>>> cl1, cl2=Closure(lambda x:2*x), Closure(lambda x:3*x)
>>> use_closure(cl1)
84
>>> use_closure(cl2)
126
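
The same idea can be tried out in pure Python with the signature from the question, double (*func_t)(double*, int). The following is only a minimal sketch: make_callback and call_like_c are hypothetical helper names, and call_like_c merely stands in for some_f by calling the callback through its raw address, the way C code would.

```python
import ctypes

# Signature from the question: double (*func_t)(double*, int)
FUNC_T = ctypes.CFUNCTYPE(ctypes.c_double,
                          ctypes.POINTER(ctypes.c_double),
                          ctypes.c_int)

def make_callback(python_fun):
    """Wrap python_fun (which takes a list of floats) as a C-callable func_t."""
    def inner(data, n):
        # data arrives as POINTER(c_double); copy its n values into a list
        return float(python_fun([data[i] for i in range(n)]))
    return FUNC_T(inner)  # ctypes generates the native thunk at run time

def call_like_c(callback, values):
    """Stand-in for some_f (assumption): call through the raw function address."""
    address = ctypes.cast(callback, ctypes.c_void_p).value
    c_fun = FUNC_T(address)
    buf = (ctypes.c_double * len(values))(*values)
    return c_fun(buf, len(values))

# Two independent closures alive at the same time
sum_cb = make_callback(sum)
max_cb = make_callback(max)

print(call_like_c(sum_cb, [1.0, 2.0, 3.0]))
print(call_like_c(max_cb, [1.0, 5.0, 3.0]))
```

The objects returned by make_callback must stay referenced for as long as the C side may call them; once they are garbage-collected, the function pointer is no longer valid.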

This answer is more in a do-it-yourself style, and while not uninteresting, you should refer to my other answer for a concise recipe.


This answer is a hack and a little bit over the top; it only works on 64-bit Linux (x86-64) and probably should not be recommended. Yet I just cannot stop myself from posting it.

There are actually four versions:

  • how easy life could be if the API took the possibility of closures into consideration
  • using a global state to produce a single closure [also considered by you]
  • using multiple global states to produce multiple closures at the same time [also considered by you]
  • using jit-compiled functions to produce an arbitrary number of closures at the same time

For the sake of simplicity I chose a simpler signature of func_t - int (*func_t)(void).

I know, you cannot change the API. Yet I cannot embark on a journey full of pain without mentioning how simple it could be... There is a quite common trick to fake closures with function pointers: just add an additional parameter to your API (normally void *), i.e.:

%%cython
#version 1: Life could be so easy
# needs Cython >= 0.28 because of verbatim C-code feature
# (the %%cython cell magic has to be the first line of the cell)
cdef extern from *: #fill some_f with life
    """
    typedef int (*func_t)(void *);
    static int some_f(func_t fun, void *params){
        return fun(params);
    }
    """
    ctypedef int (*func_t)(void *)
    int some_f(func_t myFunc, void *params)

cdef int fun(void *obj):
    print(<object>obj)
    return len(<object>obj)

def doit(s):
    cdef void *params = <void*>s
    print(some_f(&fun, params))

We basically use void *params to pass the inner state of the closure to fun and so the result of fun can depend on this state.
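
The extra-parameter trick can also be sketched in pure Python with ctypes. This is only an illustration under stated assumptions: some_f here is a Python stand-in for the hypothetical two-argument C API, and turning id(s) into a pointer relies on the CPython detail that id() returns the object's address.

```python
import ctypes

# Callback type for the hypothetical API: int (*func_t)(void *)
CALLBACK = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_void_p)

@CALLBACK
def fun(params):
    # params arrives as an integer address; turn it back into the Python object
    obj = ctypes.cast(params, ctypes.py_object).value
    return len(obj)

def some_f(callback, params):
    """Stand-in (assumption) for the C function: forwards params unchanged."""
    return callback(params)

def doit(s):
    # s stays alive as a local variable for the duration of the call
    params = ctypes.c_void_p(id(s))  # CPython-specific: id() is the address
    return some_f(fun, params)

print(doit('A'))
print(doit([1, 2, 3]))
```

A single callback serves every object, because the state travels in the params argument rather than in the function pointer.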

The behavior is as expected:

>>> doit('A')
A
1

But alas, the API is how it is. We could use a global pointer and a wrapper to pass the information:

%%cython
#version 2: Use global variable for information exchange
# needs Cython >= 0.28 because of verbatim C-code feature
# (the %%cython cell magic has to be the first line of the cell)
cdef extern from *:
    """
    typedef int (*func_t)();
    static int some_f(func_t fun){
        return fun();
    }
    static void *obj_a=NULL;
    """
    ctypedef int (*func_t)()
    int some_f(func_t myFunc)
    void *obj_a

cdef int fun(void *obj):
    print(<object>obj)
    return len(<object>obj)

cdef int wrap_fun():
    global obj_a
    return fun(obj_a)

cdef func_t create_fun(obj):
    global obj_a
    obj_a=<void *>obj
    return &wrap_fun


def doit(s):
    cdef func_t fun = create_fun(s)
    print(some_f(fun))

With the expected behavior:

>>> doit('A')
A
1

create_fun is just a convenience function: it sets the global object and returns the corresponding wrapper around the original function fun.

NB: It would be safer to make obj_a a Python-object, because void * could become dangling - but to keep the code nearer to versions 1 and 4 we use void * instead of object.

But what if more than one closure is in use at the same time, let's say two? Obviously, with the approach above we need two global objects and two wrapper functions to achieve our goal:

%%cython
#version 3: two function pointers at the same time
# (the %%cython cell magic has to be the first line of the cell)
cdef extern from *:
    """
    typedef int (*func_t)();
    static int some_f(func_t fun){
        return fun();
    }
    static void *obj_a=NULL;
    static void *obj_b=NULL;
    """
    ctypedef int (*func_t)()
    int some_f(func_t myFunc)
    void *obj_a
    void *obj_b

cdef int fun(void *obj):
    print(<object>obj)
    return len(<object>obj)

cdef int wrap_fun_a():
    global obj_a
    return fun(obj_a)

cdef int wrap_fun_b():
    global obj_b
    return fun(obj_b)

cdef func_t create_fun(obj) except NULL:
    global obj_a, obj_b
    if obj_a == NULL:
        obj_a=<void *>obj
        return &wrap_fun_a
    if obj_b == NULL:
        obj_b=<void *>obj
        return &wrap_fun_b
    raise Exception("Not enough slots")

cdef void delete_fun(func_t fun):
    global obj_a, obj_b
    if fun == &wrap_fun_a:
        obj_a=NULL
    if fun == &wrap_fun_b:
        obj_b=NULL

def doit(s):
    ss = s+s
    cdef func_t fun1 = create_fun(s)
    cdef func_t fun2 = create_fun(ss)
    print(some_f(fun2))
    print(some_f(fun1))
    delete_fun(fun1)
    delete_fun(fun2)

After compiling, as expected:

>>> doit('A')
AA
2
A
1    

But what if we have to provide an arbitrary number of function-pointers at the same time?

The problem is that we need to create the wrapper functions at run time, because there is no way to know at compile time how many we will need. The only thing I can think of is to jit-compile these wrapper functions when they are needed.

The wrapper function looks quite simple, here in assembler:

wrapper_fun:
    movq address_of_params, %rdi      ; void *param is the parameter of fun
    movq address_of_fun, %rax         ; address of the function which should be called
    jmp  *%rax                        ;jmp instead of call because it is last operation

The addresses of params and of fun will be known at run time, so we just have to link, i.e. replace the placeholders in the resulting machine code.
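
The linking step itself is plain byte patching and can be tried without executing anything. Below is a minimal sketch that fills the two 8-byte placeholders of the x86-64 blueprint used in version 4; link_bytes is a hypothetical helper name and the two addresses are made-up values for illustration only.

```python
import struct

# Machine-code blueprint (x86-64): two movabs instructions and an indirect jump
BLUEPRINT = (b'\x48\xbf' + b'\x00' * 8     # movabs $placeholder, %rdi
             + b'\x48\xb8' + b'\x00' * 8   # movabs $placeholder, %rax
             + b'\xff\xe0')                # jmpq *%rax

def link_bytes(obj_address, fun_address):
    """Return a copy of the blueprint with both placeholders filled in."""
    code = bytearray(BLUEPRINT)
    # Each opcode is 2 bytes long, so the placeholders start at offsets 2 and 12
    struct.pack_into('<Q', code, 2, obj_address)           # little-endian 64-bit
    struct.pack_into('<Q', code, 2 + 8 + 2, fun_address)
    return bytes(code)

# Made-up addresses, for illustration only
code = link_bytes(0x1122334455667788, 0x00007F0000001000)
print(code.hex())
```

The resulting bytes still have to be copied into memory that is mapped as executable before they can be called.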

In my implementation I'm following more or less this great article: https://eli.thegreenplace.net/2017/adventures-in-jit-compilation-part-4-in-python/

%%cython
#version 4: jit-compiled wrapper
# (the %%cython cell magic has to be the first line of the cell)

from libc.string cimport memcpy

cdef extern from *:
    """
    typedef int (*func_t)(void);
    static int some_f(func_t fun){
        return fun();
    }
    """
    ctypedef int (*func_t)()
    int some_f(func_t myFunc)



cdef extern from "sys/mman.h":
       void *mmap(void *addr, size_t length, int prot, int flags,
                  int fd, size_t offset);    
       int munmap(void *addr, size_t length);

       int PROT_READ  #  #define PROT_READ  0x1     /* Page can be read.  */
       int PROT_WRITE #  #define PROT_WRITE 0x2     /* Page can be written.  */
       int PROT_EXEC  #  #define PROT_EXEC  0x4     /* Page can be executed.  */

       int MAP_PRIVATE    # #define MAP_PRIVATE  0x02    /* Changes are private.  */
       int MAP_ANONYMOUS  # #define MAP_ANONYMOUS  0x20    /* Don't use a file.  */


#                             |-----8-byte-placeholder ---|
blue_print =      b'\x48\xbf\x00\x00\x00\x00\x00\x00\x00\x00'  # movabs 8-byte-placeholder,%rdi
blue_print+=      b'\x48\xb8\x00\x00\x00\x00\x00\x00\x00\x00'  # movabs 8-byte-placeholder,%rax
blue_print+=      b'\xff\xe0'                                       # jmpq   *%rax ; jump to address in %rax

cdef func_t link(void *obj, void *fun_ptr) except NULL:
    cdef size_t N=len(blue_print)
    cdef char *mem=<char *>mmap(NULL, N, 
                                PROT_READ | PROT_WRITE | PROT_EXEC,
                                MAP_PRIVATE | MAP_ANONYMOUS,
                                -1,0)
    if <long long int>mem==-1:
        raise OSError("failed to allocate memory with mmap")

    #copy blueprint:
    memcpy(mem, <char *>blue_print, N);

    #inject object address:
    memcpy(mem+2, &obj, 8);

    #inject function address:
    memcpy(mem+2+8+2, &fun_ptr, 8);

    return <func_t>(mem)


cdef int fun(void *obj):
    print(<object>obj)
    return len(<object>obj)


cdef func_t create_fun(obj) except NULL:
    return link(<void *>obj, <void *>&fun)

cdef void delete_fun(func_t fun):
    munmap(<void *>fun, len(blue_print))  # explicit cast: munmap expects void *

def doit(s):
    ss, sss = s+s, s+s+s
    cdef func_t fun1 = create_fun(s)
    cdef func_t fun2 = create_fun(ss)   
    cdef func_t fun3 = create_fun(sss)  
    print(some_f(fun2))
    print(some_f(fun1))
    print(some_f(fun3))
    delete_fun(fun1)
    delete_fun(fun2)
    delete_fun(fun3)

And now, the expected behavior:

>>> doit('A')
AA
2
A
1
AAA
3  

After looking at this, maybe there is a chance the API can be changed?

1 Comment

I think this is a useful answer, mostly for parts 1 (the "standard" way of passing "unknown data", with void*) and 4 (it's good to see how you'd actually go about creating a runtime function). I think there should be a warning about "refcounting" for the void* approach: it's not uncommon for the callback and data pointer to be saved for later, and you do need to be careful to make sure the object survives until then.
