Udacity – Intro to Artificial Intelligence – Unsupervised Learning – K-Means Clustering

A quick memory refresher on how K-means works in Unsupervised Learning:

First, pick how many clusters we wish to find: K = 2 means “finding two clusters”, K = 3 means “finding three clusters”, and so on.

Algorithm (assuming K = 2 for simplicity's sake):

  1. Assign two random cluster center points: A and B.
  2. Join A and B with a straight line, find its mid-point, and draw a perpendicular line through it. This is the dividing line between cluster A and cluster B: each data point belongs to whichever center it is closer to.
  3. Now our goal is to find revised A and B locations. Do this by picking the A location that minimizes the total squared straight-line distance to all the points in cluster A, which is simply the mean of those points. Do the same for B. We now have the new A and B locations.
  4. Repeat step 2 and step 3 until the A and B locations are more or less stable, i.e. until we have two well-defined clusters.
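
The four steps above can be sketched in a few lines of plain Python. This is only a minimal illustration, not the code used in the Udacity course: the function name `kmeans` and the parameters `max_iters`, `tol` and `seed` are made up for this example, the initial centers are assumed to be drawn from the data points themselves, and an empty cluster simply keeps its old center.

```python
import math
import random


def kmeans(points, k=2, max_iters=100, tol=1e-9, seed=0):
    """Plain K-means on a list of (x, y) tuples (illustrative sketch)."""
    rng = random.Random(seed)
    # Step 1: pick k distinct data points as the random initial centers.
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(max_iters):
        # Step 2: assign each point to its nearest center. For k = 2 this is
        # the same as splitting along the perpendicular bisector of A and B.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Step 3: move each center to the mean of its assigned points.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if not members:
                # Assumed convention: an empty cluster keeps its old center.
                new_centers.append(centers[j])
                continue
            new_centers.append(
                tuple(sum(coord) / len(members) for coord in zip(*members))
            )
        # Step 4: stop once the centers have essentially stopped moving.
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    return centers, labels


# Two small, well-separated groups of made-up points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (8.5, 9.0), (9.0, 8.0)]
centers, labels = kmeans(points, k=2)
```

On this toy data the first three points end up in one cluster and the last three in the other, with each center sitting at the mean of its group. Which group gets label 0 depends on the random initial pick, so only the grouping is meaningful, not the label numbers.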

This short video by Udacity summarizes this very nicely:
