I'm doing a kernel density estimation of a dataset (a collection of points).
Fitting the estimator is fine; the problem is that getting the density value for each point is very slow:
from sklearn.neighbors import KernelDensity
# this speed is ok
kde = KernelDensity(bandwidth=2.0, atol=0.0005, rtol=0.01).fit(sample)
# this is very slow
kde_result = kde.score_samples(sample)
The sample consists of 300,000 (x, y) points.
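(One detail in case it matters: as far as I understand, score_samples returns the log of the density, so I exponentiate the result when I need the actual density values.)

import numpy as np
# score_samples returns log-density values; exponentiate to get densities
density = np.exp(kde_result)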
I'm wondering if it's possible to run it in parallel so that it finishes faster.
For example, maybe I can divide the sample into smaller sets and run score_samples on each set at the same time (a rough sketch of what I mean is below)? Specifically:
- I'm not familiar with parallel computing at all, so I'm wondering whether it's applicable in my case.
- If this can really speed up the process, what should I do? I'm just running the script in an IPython notebook and have no prior experience with this; is there a good and simple example for my case?
I'm reading http://ipython.org/ipython-doc/dev/parallel/parallel_intro.html now.
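To make the idea concrete, here is a rough sketch of what I have in mind. I'm not sure joblib is the right tool (it's just the first thing I found), and the chunk count of 4 is arbitrary:

import numpy as np
from joblib import Parallel, delayed

# split the query points into chunks and score each chunk in a separate worker process
n_chunks = 4  # arbitrary; presumably something like the number of CPU cores
chunks = np.array_split(sample, n_chunks)
results = Parallel(n_jobs=n_chunks)(
    delayed(kde.score_samples)(chunk) for chunk in chunks
)
kde_result = np.concatenate(results)

I don't know whether shipping the fitted estimator to each worker adds a lot of overhead, which is part of what I'm asking.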
UPDATE:
import cProfile
cProfile.run('kde.score_samples(sample)')
64 function calls in 8.653 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 8.653 8.653 <string>:1(<module>)
2 0.000 0.000 0.000 0.000 _methods.py:31(_sum)
2 0.000 0.000 0.000 0.000 base.py:870(isspmatrix)
1 0.000 0.000 8.653 8.653 kde.py:133(score_samples)
4 0.000 0.000 0.000 0.000 numeric.py:464(asanyarray)
2 0.000 0.000 0.000 0.000 shape_base.py:60(atleast_2d)
2 0.000 0.000 0.000 0.000 validation.py:105(_num_samples)
2 0.000 0.000 0.000 0.000 validation.py:126(_shape_repr)
6 0.000 0.000 0.000 0.000 validation.py:153(<genexpr>)
2 0.000 0.000 0.000 0.000 validation.py:268(check_array)
2 0.000 0.000 0.000 0.000 validation.py:43(_assert_all_finite)
6 0.000 0.000 0.000 0.000 {hasattr}
4 0.000 0.000 0.000 0.000 {isinstance}
12 0.000 0.000 0.000 0.000 {len}
2 0.000 0.000 0.000 0.000 {method 'append' of 'list' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
2 0.000 0.000 0.000 0.000 {method 'join' of 'str' objects}
1 8.652 8.652 8.652 8.652 {method 'kernel_density' of 'sklearn.neighbors.kd_tree.BinaryTree' objects}
2 0.000 0.000 0.000 0.000 {method 'reduce' of 'numpy.ufunc' objects}
2 0.000 0.000 0.000 0.000 {method 'sum' of 'numpy.ndarray' objects}
6 0.000 0.000 0.000 0.000 {numpy.core.multiarray.array}
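(For reference, a rough way to check how the cost grows with the number of points being scored; the subset sizes here are arbitrary:)

import time
# time score_samples on progressively larger subsets of the sample
for n in (10000, 20000, 40000, 80000):
    start = time.time()
    kde.score_samples(sample[:n])
    print(n, time.time() - start)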
As the profile shows, almost all of the time is spent in kernel_density. The cost seems to scale roughly linearly with the number of points scored, but kde_result = kde.score_samples(sample) is still very slow on the full sample.

Comment: I'm not aware of a parallel implementation of this, so it looks like you're going to have to start from scratch. I found this post that might help you get started, and I've edited the question so that a dev from scikit-learn can give you better insight.