I have Python code that creates a list of ndarrays of unequal length. The snippet looks like this:

import numpy as np
from mymodule import list_size, array_length # list_size is an int, array_length is a list of ints, and len(array_length) == list_size

ndarray_list = []

for i in range(list_size):
    ndarray_list.append(np.zeros(array_length[i]))

Now, I need to convert this to Cython, but do not know how. I tried to create a 2-d dynamically allocated array, like this:

import numpy as np
cimport numpy as np
from mymodule import list_size, array_length

from libc.stdlib cimport malloc

cdef int i
cdef double **ndarray_list = <double **>malloc(list_size * sizeof(double *))
for i in range(list_size):
    ndarray_list[i] = <double *>malloc(array_length[i] * sizeof(double))

However, this only gives me a raw double pointer in ndarray_list[i], which I cannot pass to other functions that require ndarray methods.

What should I do?

    I tried to condense the two approaches into one answer, but it looks much better split in two... your approach with malloc() is orders of magnitude faster, so you should consider the malloc()-based answer... Commented May 24, 2014 at 5:32

2 Answers

To pass a C double* buffer to a function that requires a numpy.ndarray, you can wrap it in a temporary Cython array (cython.view.array) that points at the same memory, so no data is copied.

This malloc()-based solution is orders of magnitude faster than the other answer based on NumPy buffers. Note that each inner array must be free()'d before the outer pointer array to avoid a memory leak.

import numpy as np
cimport numpy as np
from cython cimport view
from libc.stdlib cimport malloc, free

cdef int i, j
cdef int list_size = 10
cdef double test
# Declare the C pointers explicitly; untyped names would be Python objects
cdef double **ndarray_list = <double **>malloc(list_size * sizeof(double *))
cdef int *array_length = <int *>malloc(list_size * sizeof(int))  # sizeof(int), not sizeof(int*)
for i in range(list_size):
    array_length[i] = i+1
    ndarray_list[i] = <double *>malloc(array_length[i] * sizeof(double))
    for j in range(array_length[i]):
        ndarray_list[i][j] = j

# Fast element access through the raw pointers
for i in range(list_size):
    for j in range(array_length[i]):
        test = ndarray_list[i][j]

# Wrap each row in a Cython array (no copy) so NumPy functions accept it
cdef view.array buff
for i in range(list_size):
    buff = <double[:array_length[i]]>ndarray_list[i]
    print(np.sum(buff))

#...

for i in range(list_size):
    free(ndarray_list[i])
free(ndarray_list)
free(array_length)
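
The zero-copy wrapping idea above can also be tried in plain Python, without compiling anything. This is only a sketch and not part of the answer's Cython code: it assumes ctypes arrays as stand-ins for the malloc()'d rows, and the helper name wrap_buffer is hypothetical. Unlike the malloc() version, the ndarray here keeps the ctypes buffer alive, so there is no manual free().

```python
import ctypes
import numpy as np

def wrap_buffer(buf):
    """Return a float64 ndarray that views buf's memory without copying.

    wrap_buffer is a hypothetical helper name used only for this sketch.
    """
    return np.frombuffer(buf, dtype=np.float64)

array_length = [1, 2, 3, 4]

# ctypes arrays stand in for the malloc()'d double* rows (an assumption
# of this sketch; they are zero-initialised and garbage-collected).
raw_rows = [(ctypes.c_double * n)() for n in array_length]

# One ndarray view per row; each row keeps its own length.
views = [wrap_buffer(row) for row in raw_rows]

# Writing through the view changes the underlying C buffer.
views[2][:] = np.arange(3)
print(list(raw_rows[2]))
```

Because each view shares memory with its buffer, any function that expects an ndarray can operate on the rows in place.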

6 Comments

Thank you so much Saullo Castro for your excellent answer! Just one quick question about the code: why is buff.data cast to <char *>? Shouldn't it be a <double *>?
@ShawnWang That was my first attempt, but I got Cannot assign type 'double *' to 'char *', so I used char *. I did not find any reference explaining why we must use char *
Thanks Saullo! This is really interesting... I tried void * and it wouldn't work either. Well, at least we got it working. Thanks again! :)
@ShawnWang I am trying to find a better way to do this... without having to call np.empty() and use this char * cast...
@ShawnWang I've found a very straightforward way to do it in a much cleaner way using Cython arrays... check the update...

You can use the object type with a NumPy-based buffer. To populate ndarray_list efficiently you only need an object buffer, but note that many calls to np.zeros() may cause some slowness:

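import numpy as np
cimport numpy as np

# Note: buffer-typed np.ndarray declarations are only valid inside a function body
cdef int i, list_size
cdef np.ndarray[np.int64_t, ndim=1] array_length
cdef np.ndarray[object, ndim=1] ndarray_list

list_size = 10000
# np.int was removed in NumPy 1.24; use an explicit fixed-width dtype
array_length = np.arange(1, list_size + 1, dtype=np.int64)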

ndarray_list = np.empty(list_size, dtype=object)
for i in range(list_size):
    ndarray_list[i] = np.zeros(array_length[i], dtype=np.float64)

To access the inner arrays efficiently, you need another 1-D buffer:

cdef np.ndarray[np.float64_t, ndim=1] inner_array
cdef double test
cdef int j

for i in range(list_size):
    inner_array = ndarray_list[i]
    for j in range(inner_array.shape[0]):
        test = inner_array[j]
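
If the many np.zeros() calls turn out to be the bottleneck, one alternative is to allocate a single flat array and slice it into per-row views. This is a different technique from the object-buffer approach above, shown in plain Python as a sketch; the function name make_ragged and the offsets layout are assumptions made for illustration.

```python
import numpy as np

def make_ragged(array_length):
    """Allocate one flat float64 buffer and return (flat, rows).

    rows[i] is a view of length array_length[i] into flat, so only
    one allocation is made. make_ragged is a hypothetical helper.
    """
    lengths = np.asarray(array_length, dtype=np.intp)
    # offsets[i] is where row i starts; offsets[-1] is the total size.
    offsets = np.concatenate(([0], np.cumsum(lengths)))
    flat = np.zeros(offsets[-1], dtype=np.float64)
    rows = [flat[offsets[i]:offsets[i + 1]] for i in range(len(lengths))]
    return flat, rows

flat, rows = make_ragged([1, 2, 3])
rows[1][:] = 5.0   # writes through to the shared flat buffer
print(flat)
```

Each element of rows is an ordinary 1-D ndarray, so it can be assigned to a typed inner_array buffer in the same way as the arrays created with np.zeros().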
