I was wondering if I'm missing something when using Cython with Numpy because I haven't seen much of an improvement. I wrote this code as an example.
Naive version:
import numpy as np
from skimage.util import view_as_windows
it = 16
arr = np.arange(1000*1000, dtype=np.float64).reshape(1000,1000)
windows = view_as_windows(arr, (it, it), it)
container = np.zeros((windows.shape[0], windows.shape[1]))
def test(windows):
for i in range(windows.shape[0]):
for j in range(windows.shape[1]):
container[i,j] = np.mean(windows[i,j])
return container
%%timeit
test(windows)
1 loops, best of 3: 131 ms per loop
Cythonized version:
%%cython --annotate
import numpy as np
cimport numpy as np
from skimage.util import view_as_windows
import cython
cdef int step = 16
arr = np.arange(1000*1000, dtype=np.float64).reshape(1000,1000)
windows = view_as_windows(arr, (step, step), step)
@cython.boundscheck(False)
def cython_test(np.ndarray[np.float64_t, ndim=4] windows):
cdef np.ndarray[np.float64_t, ndim=2] container = np.zeros((windows.shape[0], windows.shape[1]),dtype=np.float64)
cdef int i, j
I = windows.shape[0]
J = windows.shape[1]
for i in range(I):
for j in range(J):
container[i,j] = np.mean(windows[i,j])
return container
%timeit cython_test(windows)
10 loops, best of 3: 126 ms per loop
As you can see, there is a very modest improvement, so maybe I'm doing something wrong. By the way, the annotation that Cython produces the following:

As you can see, the numpy lines have a yellow background even after including the efficient indexing syntax np.ndarray[DTYPE_t, ndim=2]. Why?
By the way, in my view the ideal outcome is being able to use most numpy functions but still get some reasonable improvement after taking advantage of efficient indexing syntax or maybe memory views as in HYRY's answer.
UPDATE
It seems I'm not doing anything wrong in the code I posted above and that the yellow background in some lines is normal, so I was left wondering the following: In which situations I can get a benefit from typing cdef np.ndarray[np.float64_t, ndim=2] in front of numpy arrays? I suppose there are specific instances where this is helpful, otherwise there wouldn't be much purpose in doing it.
np.float64_tinstead. Is not the same?np.meancall (do it direct).np.nditercan be useful to expose the inner loop to Cython.