
I have a list of numbers in an array. The index of each element is X and the value is Y. How do I go about partitioning/clustering this data? Given an array, I just want a set of values that mark the end of each partition. Since I'm working in Python, please mention any libraries that do this.

Thanks.

  • What's the data? What's your application? Are you sure you want clustering rather than segmenting, i.e. do you want all points in a cluster to be contiguous X samples? That is what you'd usually do for a time series. Commented May 27, 2011 at 6:53
  • Possible duplicate of "not random clusters in 1D data set". Commented Feb 1, 2013 at 7:42

1 Answer


K-Means is a very simple clustering algorithm, and I would try it first before moving on to anything more complex: http://en.wikipedia.org/wiki/K-means_clustering

Proper K-Means initialization is strongly advised, for example k-means++: http://en.wikipedia.org/wiki/K-means%2B%2B
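To make the two points above concrete, here is a minimal pure-Python sketch of K-Means on 1-D values with k-means++ seeding. The names `kmeans_1d` and `partition_ends` are made up for this example. It assumes you cluster on the values (Y) only and read the "end" of each partition as the largest value in its cluster. Treat it as an illustration, not a tuned implementation.

```python
import random

def kmeans_1d(values, k, max_iter=100, seed=0):
    """Lloyd's algorithm on 1-D values with k-means++ seeding.

    Returns (centers, labels); labels[i] is the cluster index of values[i].
    """
    rng = random.Random(seed)

    # k-means++ seeding: first center uniformly at random, each further one
    # with probability proportional to the squared distance to the nearest
    # center chosen so far.
    centers = [rng.choice(values)]
    while len(centers) < k:
        d2 = [min((v - c) ** 2 for c in centers) for v in values]
        if sum(d2) == 0:
            raise ValueError("fewer distinct values than clusters")
        centers.append(rng.choices(values, weights=d2, k=1)[0])
    centers.sort()

    def assign(cs):
        # Index of the nearest center for every value.
        return [min(range(k), key=lambda j: (v - cs[j]) ** 2) for v in values]

    for _ in range(max_iter):
        labels = assign(centers)
        new_centers = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            # Keep the old center if a cluster ends up empty.
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(centers)

def partition_ends(values, labels, k):
    """Largest value in each non-empty cluster, i.e. where each partition ends."""
    ends = []
    for j in range(k):
        members = [v for v, lab in zip(values, labels) if lab == j]
        if members:
            ends.append(max(members))
    return ends

# Example call on made-up data:
data = [1, 2, 3, 10, 11, 12]
centers, labels = kmeans_1d(data, k=2)
ends = partition_ends(data, labels, 2)
```

K-Means can still land in a poor local optimum for some seeds, so in practice it is common to run it a few times with different seeds and keep the run with the lowest within-cluster squared error.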

If you're not happy with K-Means, you can use the EM algorithm with a Gaussian mixture model ( http://en.wikipedia.org/wiki/Mixture_model ). It is not too hard to code, and you can use K-Means to initialize it!
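For a sense of how little code EM needs in 1-D, here is a rough sketch of a Gaussian mixture fit. The function name `em_gmm_1d` and its parameters are hypothetical. It assumes the starting means come from somewhere else (K-Means centers, for instance), and it starts every component with the overall variance and equal weight. The `min_var` floor is an assumption added to stop a component from collapsing onto a single point.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_gmm_1d(values, init_means, n_iter=50, min_var=1e-6):
    """Fit a 1-D Gaussian mixture with EM, starting from init_means.

    Returns (means, variances, weights, labels).
    """
    n, k = len(values), len(init_means)
    means = [float(m) for m in init_means]
    grand_mean = sum(values) / n
    spread = sum((v - grand_mean) ** 2 for v in values) / n
    variances = [max(spread, min_var)] * k
    weights = [1.0 / k] * k

    def e_step():
        # Responsibility of each component for each value.
        resp = []
        for v in values:
            p = [weights[j] * normal_pdf(v, means[j], variances[j]) for j in range(k)]
            total = sum(p)
            # If every density underflows to zero, fall back to equal shares.
            resp.append([pj / total for pj in p] if total > 0 else [1.0 / k] * k)
        return resp

    for _ in range(n_iter):
        resp = e_step()
        # M-step: re-estimate each component from its weighted values.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj == 0:
                continue  # component owns no mass; leave it unchanged
            means[j] = sum(r[j] * v for r, v in zip(resp, values)) / nj
            var = sum(r[j] * (v - means[j]) ** 2 for r, v in zip(resp, values)) / nj
            variances[j] = max(var, min_var)
            weights[j] = nj / n

    # Hard labels: the most responsible component for each value.
    resp = e_step()
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return means, variances, weights, labels

# Example call on made-up data, with starting means chosen by hand:
fit = em_gmm_1d([1, 2, 3, 10, 11, 12], init_means=[1.0, 12.0])
```

Unlike K-Means, the mixture gives each cluster its own variance and weight, which can help when the groups have very different spreads or sizes.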

Both have been implemented many times in Python; check any machine learning toolbox.


2 Comments

SciPy has a very friendly implementation of kmeans in its cluster package. I was just using it today as a matter of fact, and I happen to have the docs in another tab right now: docs.scipy.org/doc/scipy/reference/cluster.vq.html
Don't use k-means on 1-D data. Use optimized 1-D techniques.
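The last comment does not name a specific technique. One possibility, offered here as an assumption, is to exploit the fact that in 1-D the optimal clusters are contiguous runs of the sorted values, so the best split into k groups (by within-group squared error) can be found exactly with dynamic programming instead of iterating from random starts. The function name `optimal_1d_partition` is made up for this sketch, and the simple triple loop is O(k·n²), so it suits small to medium arrays.

```python
def optimal_1d_partition(values, k):
    """Split sorted values into k contiguous groups with minimal total
    within-group sum of squared deviations. Returns the list of groups."""
    xs = sorted(values)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(values)")

    # Prefix sums of x and x^2 give the cost of any segment in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, x in enumerate(xs):
        s1[i + 1] = s1[i] + x
        s2[i + 1] = s2[i] + x * x

    def sse(i, j):
        """Sum of squared deviations from the mean of xs[i:j]."""
        s = s1[j] - s1[i]
        return (s2[j] - s2[i]) - s * s / (j - i)

    inf = float("inf")
    # cost[c][j]: best cost of splitting xs[:j] into c groups.
    # cut[c][j]: start index of the last group in that best split.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                cand = cost[c - 1][i] + sse(i, j)
                if cand < cost[c][j]:
                    cost[c][j] = cand
                    cut[c][j] = i

    # Walk the cut table backwards to recover the groups.
    groups = []
    j = n
    for c in range(k, 0, -1):
        i = cut[c][j]
        groups.append(xs[i:j])
        j = i
    groups.reverse()
    return groups

# Example call on made-up data; the end of each partition is its last value.
groups = optimal_1d_partition([10, 1, 12, 2, 11, 3], k=2)
ends = [g[-1] for g in groups]
```

Because the result is exact, there is no dependence on initialization, which is the main practical advantage over running k-means on 1-D data.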
