I need to use a C library that gives me a function which takes a callback function as input. The callback in turn takes an array and returns a value. So, for example,

double candidate(double *x);

would be a valid callback.
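For reference, on the C side the callback type and the library entry point might be declared to Cython roughly like this. This is only a sketch: the header, type, and function names below are hypothetical, since the actual library isn't named here.

cdef extern from "thelib.h":
    ctypedef double (*candidate_t)(double *x)   # callback type: array in, double out
    double run_solver(candidate_t cb)           # hypothetical library function taking the callback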

I want to use Cython to implement a callback function, using Numpy to simplify the implementation.

So I am trying to implement a function

cdef double cythonCandidate(double *x):

and now I would like to "cast" x as a numpy array immediately and then do operations using numpy.

For example, I might want to write something like:

cdef double euclideanNorm(double *x):
    # cast x into a numpy array nx here - don't know how!!
    return np.sum(nx * nx)

Q1. How do I do this? How do I cast a C array into a numpy array without explicit copying, but just referencing the underlying buffer?

Q2: Is there Python overhead in using numpy the way I intend to?

1 Answer

For Q1:

%%cython -f
import numpy as np

def test_cast():
    cdef double *x = [1, 2, 3, 4, 5]
    cdef double[:] x_view = <double[:5]>x   # cast to a memoryview; refers to the underlying buffer, no copy
    xarr = np.asarray(x_view)               # the numpy array also refers to the underlying buffer, no copy
    x_view[0] = 100
    xarr[1] = 200
    x[2] = 300
    print(xarr.flags)                       # OWNDATA flag should be False
    return x[0],x[1],x[2],x[3],x[4]         # (100.0, 200.0, 300.0, 4.0, 5.0)

Note: If you don't declare x_view and instead write xarr = np.asarray(<double[:5]>x) directly, the Cython compiler may crash with the error message AttributeError: 'CythonScope' object has no attribute 'viewscope'. This can be fixed with from cython cimport view, for example:

%%cython -f
from cython cimport view  # comment this line to see what will happen
import numpy as np

def test_error_cast():
    cdef double *x = [1, 2, 3, 4, 5]
    xarr = np.asarray(<double[:5]>x)
    xarr[0] = 200
    return x[0],x[1],x[2],x[3],x[4]

I don't know whether that's a feature or a bug.

For Q2: The numpy overhead should be significant when the array is small; see the benchmark below.

%%cython -a
from cython cimport view
import numpy as np

cdef inline double euclideanNorm(double *x, size_t x_size):
    # numpy version: wrap the C buffer without copying, then reduce with np.sum
    xarr = np.asarray(<double[:x_size]>x)
    return np.sum(xarr*xarr)

cdef inline double euclideanNorm_c(double *x, size_t x_size):
    # pure C version: a plain loop with no Python objects involved
    cdef double ss = 0.0
    cdef size_t i
    for i in range(x_size):
        ss += x[i] * x[i]
    return ss

def c_norm(double[::1] x):
    return euclideanNorm_c(&x[0], x.shape[0])

def np_norm(double[::1] x):
    return euclideanNorm(&x[0], x.shape[0])

Small array on my PC:

import numpy as np
small_arr = np.random.rand(100)
print(c_norm(small_arr))
print(np_norm(small_arr))
%timeit c_norm(small_arr)   # 1000000 loops, best of 3: 864 ns per loop
%timeit np_norm(small_arr)  # 100000 loops, best of 3: 8.51 µs per loop

Big array on my PC:

big_arr = np.random.rand(1000000)
print(c_norm(big_arr))
print(np_norm(big_arr))
%timeit c_norm(big_arr)   # 1000 loops, best of 3: 1.46 ms per loop
%timeit np_norm(big_arr)  # 100 loops, best of 3: 4.93 ms per loop
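
To connect this back to the original callback: a minimal sketch of cythonCandidate using the no-copy cast could look like the following. Note that the memoryview cast needs the array length, which the callback signature in the question doesn't pass, so the length 5 below is a placeholder assumption.

%%cython
from cython cimport view
import numpy as np

cdef double cythonCandidate(double *x):
    # The cast needs a length; 5 is a placeholder because the C callback
    # signature in the question doesn't provide one.
    xarr = np.asarray(<double[:5]>x)   # view of the C buffer, no copy
    return np.sum(xarr * xarr)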