What clustering algorithm to use on 1-d data? [closed]

Question

I have a list of numbers in an array. The index of each element is X and the value is Y. How do I go about partitioning/clustering this data? Given an array, I just want a set of values that mark the end of each partition. Since I'm working in Python, please mention any libraries that do this.

Thanks.

What's the data? What's your application? Are you sure you want clustering rather than segmenting, i.e., do you want all points in a cluster to be contiguous X samples? That is what you'd usually do for a time series.
– dimatura, Commented May 27, 2011 at 6:53

Monkey · Accepted Answer · 2011-05-27 03:19:40Z

5

K-Means is a very simple clustering algorithm and, I would say, the first one to try before reaching for anything more complex: http://en.wikipedia.org/wiki/K-means_clustering
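
To make the idea concrete, here is a minimal sketch of K-Means (plain Lloyd iterations) on 1-D values with NumPy. The names `kmeans_1d` and `boundaries_1d`, the sample data, and the starting centers are all made up for illustration. It also shows one way to get the "end of each partition" values asked for in the question: in 1-D, the borders between clusters are the midpoints between neighbouring centers.

```python
import numpy as np

def kmeans_1d(values, centers, n_iter=100):
    """Plain Lloyd iterations on 1-D data, starting from the given centers."""
    values = np.asarray(values, dtype=float)
    centers = np.asarray(centers, dtype=float).copy()
    for _ in range(n_iter):
        # Assign every value to its nearest center.
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        # Move each center to the mean of its members (leave it alone if empty).
        new_centers = centers.copy()
        for j in range(len(centers)):
            members = values[labels == j]
            if members.size:
                new_centers[j] = members.mean()
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

def boundaries_1d(centers):
    """In 1-D the cluster borders are the midpoints between sorted centers."""
    c = np.sort(centers)
    return (c[:-1] + c[1:]) / 2.0

# Illustrative data: three obvious groups of values.
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0, 20.0, 21.0]
centers, labels = kmeans_1d(values, centers=[0.0, 10.0, 25.0])
cuts = boundaries_1d(centers)
```

The result depends on the starting centers, which is why the choice of initialization matters.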

Proper K-Means initialization, such as k-means++, is strongly advised: http://en.wikipedia.org/wiki/K-means%2B%2B
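
As a rough sketch of what k-means++ seeding looks like on 1-D values: the first center is drawn uniformly from the data, and each further center is drawn with probability proportional to its squared distance from the nearest center already chosen. The function name `kmeans_pp_init` and the sample data are hypothetical, and the code assumes the data contains at least `k` distinct values.

```python
import numpy as np

def kmeans_pp_init(values, k, rng=None):
    """Pick k starting centers with k-means++ (squared-distance weighted) sampling."""
    rng = np.random.default_rng(rng)
    values = np.asarray(values, dtype=float)
    # First center: uniformly at random from the data.
    centers = [values[rng.integers(len(values))]]
    for _ in range(1, k):
        # Squared distance from each point to its closest chosen center.
        d2 = np.min((values[:, None] - np.array(centers)[None, :]) ** 2, axis=1)
        # Sample the next center with probability proportional to d2.
        # Assumes d2.sum() > 0, i.e. there are still unchosen distinct values.
        centers.append(values[rng.choice(len(values), p=d2 / d2.sum())])
    return np.array(centers)

# Illustrative data and a fixed seed so the draw is repeatable.
data = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0, 20.0, 21.0]
seeds = kmeans_pp_init(data, k=3, rng=0)
```

The returned seeds would then be used as the starting centers for the K-Means iterations.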

If you're not happy with K-Means, you can use the EM algorithm with a Gaussian mixture model (http://en.wikipedia.org/wiki/Mixture_model). It is not too hard to code, and you can use K-Means to initialize it.
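
A minimal sketch of EM for a 1-D Gaussian mixture, assuming NumPy and starting means supplied by the caller (for example from a K-Means run). The name `gmm_em_1d`, the variance floor of `1e-6`, and the synthetic data are illustrative choices, not a reference implementation.

```python
import numpy as np

def gmm_em_1d(values, means, n_iter=200, tol=1e-8):
    """EM for a 1-D Gaussian mixture, started from the given means."""
    x = np.asarray(values, dtype=float)
    mu = np.asarray(means, dtype=float).copy()
    k = len(mu)
    var = np.full(k, x.var() + 1e-6)   # start every component at the global variance
    weight = np.full(k, 1.0 / k)       # equal mixing weights
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: responsibilities, computed in log space for stability.
        log_p = (np.log(weight) - 0.5 * np.log(2 * np.pi * var)
                 - 0.5 * (x[:, None] - mu) ** 2 / var)
        log_norm = np.logaddexp.reduce(log_p, axis=1)
        resp = np.exp(log_p - log_norm[:, None])
        # M-step: re-estimate weights, means and variances.
        nk = resp.sum(axis=0)
        weight = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
        # Stop once the log-likelihood no longer improves.
        ll = log_norm.sum()
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    # Hard labels taken from the last computed responsibilities.
    return weight, mu, var, resp.argmax(axis=1)

# Synthetic data: two Gaussian groups centered near 0 and 8.
rng = np.random.default_rng(0)
sample = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(8.0, 1.0, 200)])
weight, mu, var, labels = gmm_em_1d(sample, means=[-1.0, 5.0])
```

Unlike K-Means, each component gets its own variance and weight, so groups of different spread or size can be modelled.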

Both have been implemented many times in Python; check any machine learning toolbox.


2 Comments

jscs Over a year ago

SciPy has a very friendly implementation of kmeans in its cluster package. I was just using it today as a matter of fact, and I happen to have the docs in another tab right now: docs.scipy.org/doc/scipy/reference/cluster.vq.html
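
For reference, a small sketch of how `scipy.cluster.vq` might be used on 1-D values. The sample data and the initial centroids are made up for illustration; passing an array as the second argument of `kmeans` makes it start from those centroids instead of picking random ones.

```python
import numpy as np
from scipy.cluster.vq import kmeans, vq

values = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0, 20.0, 21.0])
obs = values.reshape(-1, 1)            # one row per observation, one feature

# Initial centroids (illustrative guesses), one row per cluster.
guess = np.array([[0.0], [10.0], [25.0]])
codebook, distortion = kmeans(obs, guess)

# vq assigns every observation to its nearest centroid.
labels, dists = vq(obs, codebook)

# Partition borders: midpoints between neighbouring sorted centroids.
centers = np.sort(codebook.ravel())
cuts = (centers[:-1] + centers[1:]) / 2.0
```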

Has QUIT--Anony-Mousse Over a year ago

Don't use k-means on 1-d data. Use optimized 1-d techniques.
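
The comment does not name a specific technique. One possibility, shown purely as an illustration, is exact 1-D k-means by dynamic programming over the sorted values: in one dimension every optimal cluster is a contiguous run of the sorted data, so the best split points can be found exactly instead of iterating from a random start. The function name `optimal_kmeans_1d` and the sample data are hypothetical, and this simple version runs in O(k * n^2) time.

```python
import numpy as np

def optimal_kmeans_1d(values, k):
    """Exact 1-D k-means via dynamic programming (assumes len(values) >= k)."""
    x = np.sort(np.asarray(values, dtype=float))
    n = len(x)
    # Prefix sums give the squared error of the run x[i:j] in O(1).
    s1 = np.concatenate(([0.0], np.cumsum(x)))
    s2 = np.concatenate(([0.0], np.cumsum(x ** 2)))

    def sse(i, j):
        return s2[j] - s2[i] - (s1[j] - s1[i]) ** 2 / (j - i)

    # cost[c][j]: best total error when the first j values form c clusters.
    cost = np.full((k + 1, n + 1), np.inf)
    split = np.zeros((k + 1, n + 1), dtype=int)
    cost[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                cand = cost[c - 1][i] + sse(i, j)
                if cand < cost[c][j]:
                    cost[c][j] = cand
                    split[c][j] = i
    # Walk back through the table to recover where each cluster starts.
    starts, j = [], n
    for c in range(k, 0, -1):
        j = split[c][j]
        starts.append(j)
    return x, starts[::-1]

# Illustrative data: three obvious groups of values.
x, starts = optimal_kmeans_1d([20.0, 1.0, 11.0, 2.0, 21.0, 3.0, 10.0, 12.0], k=3)
# Borders halfway between the last value of one cluster and the first of the next.
cuts = [(x[s - 1] + x[s]) / 2.0 for s in starts[1:]]
```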
