I am trying to wrap some C code with Cython. I am stuck on one function, defined in a file called "algo.c":

int update_emission_mat(
// input
int* count_data, 
double *bound_state_data_pmf, double *unbound_state_data_pmf,
int silent_states_begin, int nucleosome_present, 
int nucleosome_start, int n_padding_states,
// output matrix and its dimensions
double *emission_mat, int n_obs, int n_states
) {......}

The C code is correct; it has been used and tested before. In "algo.h" I declared the same function:

int update_emission_mat(
// input
int *count_data, 
double *bound_state_data_pmf, double *unbound_state_data_pmf,
int silent_states_begin, int nucleosome_present, 
int nucleosome_start, int n_padding_states,
// output matrix and its dimensions
double *emission_mat, int obs_len, int n_states
);

To expose the function to Cython, I have "algo.pxd" with the following contents:

cdef extern from "algo.h":
    ...
    int update_emission_mat(
        # input
        int *count_data, 
        double *bound_state_data_pmf, double *unbound_state_data_pmf,
        int silent_states_begin, int nucleosome_present, 
        int nucleosome_start, int n_padding_states,
        # output matrix and its dimensions
        double *emission_mat, int obs_len, int n_states
    )

Finally, in the main Cython file "main.pyx", I defined a class:

cimport algo
import numpy as np
cimport numpy as np
import cython
cdef class main:
    ... 
    cdef np.ndarray data_emission_matrix
    cdef np.ndarray count_data
    ...
    # in one function of the class, I defined and initialized data_emission_matrix
    cpdef alloc_space(self):
        ...
        cdef np.ndarray[np.double_t, ndim = 2] data_emission_matrix = np.ones((self.n_obs, self.n_states), dtype = np.float64, order = 'C')
        self.data_emission_matrix = data_emission_matrix
        ...

    # in another function, I defined and initialized count_data
    cpdef read_counts_data(self, data_file):
        df = open(data_file, 'r') # data_file only contains a column of integers
        dfc = df.readlines()
        df.close()
        cdef np.ndarray[np.int_t, ndim = 1] count_data = np.array(dfc, dtype = np.int, order = 'C')    
        self.count_data = count_data

    # finally, called the c function
    cpdef update_data_emission_matrix_using_counts_data(self):
        ....

        cdef np.ndarray[np.int, ndim = 1] count_data = self.count_data
        cdef np.ndarray[np.double_t, ndim = 2] data_emission_matrix = \
            self.data_emission_matrix

        cdef int n_padding_states = 5
        algo.update_emission_mat(
            &count_data[0], &bound_state_data_pmf[0], 
            &unbound_state_data_pmf[0], self.silent_states_begin,
            self.nucleosome_present, self.nucleosome_start, 
            n_padding_states, &data_emission_matrix[0,0],
            self.n_obs, self.n_states
            )

The file does not compile. The error message complains about taking the address of "count_data":

Error compiling Cython file:
------------------------------------------------------------
...
        self.data_emission_matrix

    cdef int n_padding_states = 5

    algo.update_emission_mat(
        &count_data[0], &bound_state_data_pmf[0],
       ^
------------------------------------------------------------

main.pyx:138:12: Cannot take address of Python variable

I am confused because I treat "data_emission_matrix" essentially the same way, yet Cython does not complain about it. I apologize for the long code listing. I am fairly new to Cython and could not pin down the exact spot that causes the error. I appreciate any help!

1 Answer

Use the following expression to pass the address of the data buffer, provided you have ensured the ndarray is C_CONTIGUOUS:

<int *>count_data.data

edit:

The real cause of the error is the element type declared for count_data: it should be np.int_t, not np.int.
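
Regarding the C_CONTIGUOUS condition for the pointer cast: a raw pointer only describes the data correctly when the elements sit back to back in memory. The sketch below illustrates the idea with the standard library's array and memoryview instead of NumPy (an assumption made so that it runs without extra packages); the helper name safe_for_raw_pointer is hypothetical.

```python
import array

# A flat buffer of C ints, standing in for a 1-D integer ndarray.
counts = array.array('i', [3, 1, 4, 1, 5, 9])

full_view = memoryview(counts)   # every element, stored back to back
strided_view = full_view[::2]    # every other element, stride of two items


def safe_for_raw_pointer(view):
    """Hypothetical helper: a bare pointer plus a length only
    describes the data if the view is C-contiguous."""
    return view.c_contiguous


print(safe_for_raw_pointer(full_view))
print(safe_for_raw_pointer(strided_view))
```

The strided view shares memory with the full buffer but skips elements, so handing its start address to C code that walks consecutive ints would read the wrong values.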

3 Comments

Isn't this approach not recommended, as per github.com/cython/cython/wiki/tutorials-NumpyPointerToC ?
You should use np.int_t, not np.int in cdef np.ndarray[np.int_t, ndim = 1].
I tried both. When I used np.int_t, I got the error: Cannot assign type 'int_t *' to 'int *'
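
The 'int_t *' versus 'int *' message suggests that np.int_t is not the same C type as the int the header expects. Here is a minimal sketch for checking this on a given machine, assuming np.int_t corresponds to C long (this is an assumption, not something confirmed in the thread). It uses only ctypes, and the helper name same_layout is hypothetical.

```python
import ctypes

# Sizes in bytes of the two C types named in the error message.
# Assumption: np.int_t is a typedef for C long.
int_size = ctypes.sizeof(ctypes.c_int)
long_size = ctypes.sizeof(ctypes.c_long)


def same_layout(size_a, size_b):
    """Hypothetical helper: True when two element sizes match."""
    return size_a == size_b


print("C int :", int_size, "bytes")
print("C long:", long_size, "bytes")
print("same size on this platform:", same_layout(int_size, long_size))
```

Where the sizes differ, casting the pointer would make the C function misread the buffer. One possible way out, again as a suggestion only, is to create the array with a C int dtype such as np.intc and declare the buffer element type as a C int, so that &count_data[0] is already an int *.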
