Passing numpy integer array to c code

Question

I'm trying to write Cython code to dump a dense feature matrix and target vector pair to libsvm format faster than sklearn's built-in code. I get a compilation error complaining about a type mismatch when passing the target vector (a numpy array of ints) to the relevant C function.

Here's the code:

import numpy as np
cimport numpy as np
cimport cython

cdef extern from "cdump.h":
    int filedump( double features[], int numexemplars, int numfeats, int target[], char* outfname)

@cython.boundscheck(False)
@cython.wraparound(False)
def fastdumpdense_libsvmformat(np.ndarray[np.double_t,ndim=2] X, y, outfname):
    if X.shape[0] != len(y):
        raise ValueError("X and y need to have the same number of points")

    cdef int numexemplars = X.shape[0]
    cdef int numfeats = X.shape[1]

    cdef bytes py_bytes = outfname.encode()
    cdef char* outfnamestr = py_bytes

    cdef np.ndarray[np.double_t, ndim=2, mode="c"] X_c
    cdef np.ndarray[np.int_t, ndim=1, mode="c"] y_c
    X_c = np.ascontiguousarray(X, dtype=np.double)
    y_c = np.ascontiguousarray(y, dtype=np.int)
    retval = filedump( &X_c[0,0], numexemplars, numfeats, &y_c[0], outfnamestr)

    return retval

When I attempt to compile this code using distutils, I get the error

cythoning fastdump_svm.pyx to fastdump_svm.cpp

Error compiling Cython file:
------------------------------------------------------------ ...

    cdef np.ndarray[np.double_t, ndim=2, mode="c"] X_c
    cdef np.ndarray[np.int_t, ndim=1, mode="c"] y_c
    X_c = np.ascontiguousarray(X, dtype=np.double)
    y_c = np.ascontiguousarray(y, dtype=np.int)
    retval = filedump( &X_c[0,0], numexemplars, numfeats, &y_c[0], outfnamestr)
                                                         ^
------------------------------------------------------------

fastdump_svm.pyx:24:58: Cannot assign type 'int_t *' to 'int *'

Any idea how to fix this error? I originally was following the paradigm of passing y_c.data, which works, but this is apparently not the recommended way.

ekipmanager · Accepted Answer · 2015-02-20 00:53:29Z

4

You can also use dtype=np.dtype("i") when creating the numpy array, so that its element type matches the C int on your machine.

cdef int [:] y_c
y_c = np.ascontiguousarray(y, dtype=np.dtype("i"))
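
As a rough illustration of why this works (a plain-Python sketch, not part of the Cython module; the sample labels in `y` are made up), an array created with dtype "i" has the same element size as a C int, so its buffer can be read back through an `int *`:

```python
import ctypes
import numpy as np

# Hypothetical target vector; any sequence of small integers would do.
y = [1, -1, 1, -1]

# Same conversion as in the answer: contiguous buffer of C ints.
y_c = np.ascontiguousarray(y, dtype=np.dtype("i"))

# View the array's memory as a C `int *` and read the elements back.
ptr = y_c.ctypes.data_as(ctypes.POINTER(ctypes.c_int))
values = [ptr[i] for i in range(y_c.size)]

print("element size:", y_c.itemsize, "sizeof(int):", ctypes.sizeof(ctypes.c_int))
print("read back through int*:", values)
```

If the element size and sizeof(int) printed here differ, the dtype is not the right one for a function declared with int target[].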

answered Feb 20, 2015 at 0:53

ekipmanager


Bi Rico · Accepted Answer · 2014-05-02 22:51:36Z

3

The problem is that numpy.int_t is not the same as int. You can easily check this by having your program print sizeof(numpy.int_t) and sizeof(int).

int is a C int, which the C standard only requires to be at least 16 bits; it's 32 bits on my machine. numpy.int_t is usually 32 or 64 bits depending on whether you're using a 32- or 64-bit build of numpy, but of course there are exceptions (probably for Windows users). If you want to know which numpy dtype matches your C int, you can do np.dtype(ctypes.c_int).
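
A minimal way to run that check from plain Python is sketched below. It compares against C long on the assumption that Cython's np.int_t is a typedef for C long, which is worth confirming for your Cython and numpy versions:

```python
import ctypes
import numpy as np

# dtype that NumPy associates with the platform's C int
c_int_dtype = np.dtype(ctypes.c_int)

# dtype for C long (assumed here to be what Cython's np.int_t maps to)
c_long_dtype = np.dtype(ctypes.c_long)

print("C int :", c_int_dtype, c_int_dtype.itemsize, "bytes")
print("C long:", c_long_dtype, c_long_dtype.itemsize, "bytes")

# The "i" type code used in the other answer names the same C int dtype.
matches_i = c_int_dtype == np.dtype("i")
print("np.dtype('i') matches C int:", matches_i)
```

On a platform where the two sizes differ, an array of the C long dtype cannot be handed to a function expecting int *, which is what the compiler error is pointing out.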

So to pass your numpy array to c code you can do:

import ctypes
cdef np.ndarray[int, ndim=1, mode="c"] y_c
y_c = np.ascontiguousarray(y, dtype=ctypes.c_int)
retval = filedump( &X_c[0,0], numexemplars, numfeats, &y_c[0], outfnamestr)

edited May 2, 2014 at 22:51

answered May 2, 2014 at 22:44

Bi Rico


2 Comments

AatG Over a year ago

The dtype in the cdef should be int_t, no? When I try it without int_t I get an error about not being able to cast pointers to Python objects. When I use int_t, I get the same error as before about casting from int_t * to int *.

Bi Rico Over a year ago

Sorry about my last comment, I seem to have misunderstood your follow-up question. The cdef should have the same type as the function declaration of filedump, so if the argument is defined as int target[], the cdef should use int. If you can change the signature of filedump, you can set them both to np.int_t, for example, but they should be the same. Make sure you're using int and not np.int: the first is a C basic type (at least when you use it in a cdef block) and the latter is a Python type.
