This video by Udacity summarizes Supervised vs Unsupervised Learning very nicely.
In a nutshell (using clustering as the example for unsupervised learning):
- Supervised Learning has labels, e.g. spam email filtering. We have 100 emails filled with words. We label each email SPAM (junk email) or HAM (email that is worth reading) up front. We then apply an algorithm (such as Naive Bayes) to build a model that tells us, given a new email, how probable it is that the email is SPAM.
- Unsupervised Learning has NO labels. The algorithm looks for structure on its own, e.g. it divides the 100 emails into two clusters. One cluster (ideally the SPAM emails) may share some common characteristics; the other (ideally the HAM emails) may share others. Example algorithms: K-means Clustering, Expectation Maximization Clustering, Spectral Clustering.
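To make the supervised case concrete, here is a minimal sketch of a Naive Bayes spam filter in plain Python. The six training emails, the function names `train` and `spam_probability`, and the use of add-one smoothing are all assumptions made for illustration; a real filter would train on far more data.

```python
import math
from collections import Counter

# Hypothetical toy training set: every email is labeled up front.
TRAIN = [
    ("win free money now", "SPAM"),
    ("free prize claim now", "SPAM"),
    ("cheap money offer", "SPAM"),
    ("meeting agenda for monday", "HAM"),
    ("lunch with the project team", "HAM"),
    ("project report for the meeting", "HAM"),
]
LABELS = ("SPAM", "HAM")


def train(examples):
    """Count words per label and the number of emails carrying each label."""
    word_counts = {label: Counter() for label in LABELS}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = set().union(*word_counts.values())
    return word_counts, label_counts, vocab


def spam_probability(text, model):
    """Return P(SPAM | text) under Naive Bayes with add-one smoothing."""
    word_counts, label_counts, vocab = model
    total_emails = sum(label_counts.values())
    log_scores = {}
    for label in LABELS:
        # Start from the prior P(label), then multiply in P(word | label).
        score = math.log(label_counts[label] / total_emails)
        words_in_label = sum(word_counts[label].values())
        for word in text.split():
            if word not in vocab:
                continue  # assumption: words never seen in training are ignored
            score += math.log(
                (word_counts[label][word] + 1) / (words_in_label + len(vocab))
            )
        log_scores[label] = score
    # Normalize the two scores so they sum to 1.
    top = max(log_scores.values())
    weights = {label: math.exp(s - top) for label, s in log_scores.items()}
    return weights["SPAM"] / sum(weights.values())


model = train(TRAIN)
p_spammy = spam_probability("free money", model)
p_hammy = spam_probability("project meeting", model)
print(f"P(SPAM | 'free money')      = {p_spammy:.2f}")
print(f"P(SPAM | 'project meeting') = {p_hammy:.2f}")
```

The labels are what make this supervised: remove the second element of each training pair and the word counts per class can no longer be computed, so a clustering algorithm would have to discover the groups by itself.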
Expectation Maximization is somewhat similar to K-means, with this core difference:
In the correspondence step:
- K-means uses “hard correspondence”: each data point belongs to exactly one cluster. When the estimated center point A is revised, only the data points currently assigned to cluster A are used; data points from other clusters (e.g. cluster B) play no part.
- Expectation Maximization uses “soft correspondence”: each data point belongs to every cluster with some probability. When the estimated center point A is revised, all data points are used, including those from other clusters (e.g. cluster B), each weighted by how likely it is to belong to cluster A.
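The difference between the two kinds of correspondence can be sketched with a single update step on one-dimensional data. This is only an illustration under stated assumptions: the points, the starting centers, and the function names `kmeans_update` and `em_update` are made up, and the EM version assumes equal-weight Gaussian clusters with a fixed spread `sigma`, whereas full Expectation Maximization would also re-estimate the spreads and cluster weights.

```python
import math

# Hypothetical 1-D data: three points near 1 and three points near 5.
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centers = [0.0, 6.0]  # initial guesses for center A and center B


def kmeans_update(points, centers):
    """Hard correspondence: each point counts only toward its nearest center."""
    assigned = [[] for _ in centers]
    for x in points:
        nearest = min(range(len(centers)), key=lambda k: abs(x - centers[k]))
        assigned[nearest].append(x)
    # A center with no assigned points stays where it was.
    return [
        sum(members) / len(members) if members else centers[k]
        for k, members in enumerate(assigned)
    ]


def em_update(points, centers, sigma=2.0):
    """Soft correspondence: every point contributes to every center.

    Assumes equal-weight Gaussians with a fixed standard deviation sigma.
    """
    weighted_sum = [0.0] * len(centers)
    weight_total = [0.0] * len(centers)
    for x in points:
        likelihoods = [
            math.exp(-((x - c) ** 2) / (2 * sigma ** 2)) for c in centers
        ]
        norm = sum(likelihoods)
        for k, likelihood in enumerate(likelihoods):
            responsibility = likelihood / norm  # P(point belongs to cluster k)
            weighted_sum[k] += responsibility * x
            weight_total[k] += responsibility
    return [weighted_sum[k] / weight_total[k] for k in range(len(centers))]


hard = kmeans_update(points, centers)
soft = em_update(points, centers)
print("k-means centers:", hard)
print("EM centers:     ", soft)
```

With these inputs, the K-means centers land on the averages of their own three points, while each EM center is pulled slightly toward the other group because the far points still carry a small weight.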
This video by Udacity summarizes this very nicely.