I'm trying to work out a procedure for clustering a data set with 52 dimensions. This is purely for my own learning, so I picked a data set whose fields are known. The data comes from the retrosheet.org Gamelogs, using the World Series data set. I'm using only columns 25-77, which hold the integer fields, and ignoring the string data.
This is my first attempt at unsupervised learning, and while I understand the concepts, I'm struggling to implement a solution in Python. I've been using scipy and numpy. If anyone knows a good place to start or has suggestions for tackling this problem, I'd appreciate it.
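
To make the question concrete, here is a minimal sketch of the kind of pipeline I have in mind, not a working solution. It assumes k-means is a reasonable first algorithm, that k=3 is an arbitrary guess, and it runs on random stand-in integers rather than the real game logs. The function name `cluster_rows` and the file name in the comment are hypothetical.

```python
import numpy as np
from scipy.cluster.vq import whiten, kmeans2


def cluster_rows(data, k=3, seed=0):
    """Scale each column to unit variance, then run k-means.

    Returns (centroids, labels). k and seed are illustrative defaults.
    """
    data = np.asarray(data, dtype=float)
    # Drop constant columns: they carry no information and have zero variance.
    keep = data.std(axis=0) > 0
    # whiten() divides each column by its standard deviation so that
    # large-valued columns do not dominate the Euclidean distance.
    scaled = whiten(data[:, keep])
    centroids, labels = kmeans2(scaled, k, minit="++", seed=seed)
    return centroids, labels


# Stand-in for the real data: 200 rows of 52 integer columns.
# With the actual file, something like the following might work, but the
# file name and the zero-based, half-open column range are assumptions:
#   data = np.genfromtxt("gamelog.txt", delimiter=",", usecols=range(25, 77))
rng = np.random.default_rng(0)
fake = rng.integers(0, 20, size=(200, 52))

centroids, labels = cluster_rows(fake, k=3)
print(centroids.shape)   # one centroid per cluster, one value per kept column
print(np.bincount(labels, minlength=3))  # number of rows assigned to each cluster
```

What I don't know is whether scaling like this is appropriate for these fields, how to choose k, or whether a different method (for example hierarchical clustering from scipy.cluster.hierarchy) would suit 52 dimensions better.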