This video by Udacity summarizes Supervised vs Unsupervised Learning very nicely.
In a nutshell (using clustering as an example for instance):
- Supervised Learning uses labels. e.g. Spam email filtering. We have 100 emails filled with words. We label each email SPAM (junk email) or HAM (email worth reading) up-front. We apply an algorithm (such as Naive Bayes) to build a model that will tell us, given a new email, how likely it is that the email is SPAM.
- Unsupervised Learning has NO labels. The algorithm will try to label the data for you. e.g. divide the 100 emails into two clusters: the SPAM cluster may share some common characteristics, and the HAM cluster may have some other characteristics. (e.g. K-means Clustering, Expectation Maximization Clustering, Spectral Clustering).
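To make the supervised case concrete, here is a minimal Naive Bayes sketch in plain Python. The emails and words are made-up toy data, and this is a bare-bones illustration of the idea (word counts per label plus Laplace smoothing), not a production spam filter:

```python
from collections import Counter
import math

# Toy labeled training data: (words in email, label). All words are made up.
emails = [
    (["win", "money", "now"], "SPAM"),
    (["cheap", "money", "offer"], "SPAM"),
    (["meeting", "tomorrow", "agenda"], "HAM"),
    (["lunch", "tomorrow"], "HAM"),
]

def train(data):
    # Count how often each word appears under each label
    word_counts = {"SPAM": Counter(), "HAM": Counter()}
    label_counts = Counter()
    vocab = set()
    for words, label in data:
        label_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return word_counts, label_counts, vocab

def prob_spam(words, word_counts, label_counts, vocab):
    # log P(label) + sum of log P(word | label), with Laplace (+1) smoothing
    scores = {}
    total = sum(label_counts.values())
    for label in label_counts:
        logp = math.log(label_counts[label] / total)
        n = sum(word_counts[label].values())
        for w in words:
            logp += math.log((word_counts[label][w] + 1) / (n + len(vocab)))
        scores[label] = logp
    # Turn the two log scores back into P(SPAM | words)
    m = max(scores.values())
    exp = {k: math.exp(v - m) for k, v in scores.items()}
    return exp["SPAM"] / sum(exp.values())

word_counts, label_counts, vocab = train(emails)
print(prob_spam(["win", "cheap", "money"], word_counts, label_counts, vocab))
# high probability: these words mostly appeared in SPAM training emails
```

The key point is that the labels do the teaching: the model only knows "win" and "cheap" are spammy because we told it which training emails were SPAM.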
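And for the unsupervised case, a minimal K-means sketch, again with made-up toy data (imagine each point is an email reduced to two numeric features). No labels are given; the algorithm discovers the two groups on its own:

```python
import random

# Toy 2-D points forming two rough groups (made-up feature values)
points = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9),
          (8.0, 8.2), (7.9, 8.0), (8.3, 7.8)]

def kmeans(points, k, iters=20, seed=0):
    rng = random.Random(seed)
    # Initialize centroids at k random distinct points
    centroids = rng.sample(points, k)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda i: sum((a - b) ** 2
                                      for a, b in zip(p, centroids[i])))
            clusters[i].append(p)
        # Update step: move each centroid to the mean of its cluster
        for i, c in enumerate(clusters):
            if c:
                centroids[i] = tuple(sum(dim) / len(c) for dim in zip(*c))
    return centroids, clusters

centroids, clusters = kmeans(points, k=2)
print(clusters)  # the two groups, recovered without any labels
```

Note the output is just "cluster 0" and "cluster 1": unlike the supervised example, nothing here says which cluster is SPAM; a human still has to interpret what each cluster's shared characteristics mean.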
I’ve recently completed the Artificial Intelligence (AI) Online Lectures taught by professor Patrick Henry Winston of MIT. The entire series contains 23 pre-recorded YouTube videos (around 50 minutes each). The lecture series covers Artificial Intelligence in general and a range of techniques, such as the Goal Tree, Search, Genetic Algorithm, Neural Net, Probabilistic Inference, Nearest Neighbour, Support Vector Machine (SVM), etc.
In order to really gain an intuition for which techniques are good for which problems, I believe a good next step would be to work through some real machine learning problems.
Kaggle data science competitions seem like a good place to start. The journey is getting more and more exciting…