Unit 6: Machine Learning
Machine Learning refers to computer programs that learn from data, for example by categorizing it, in order to extract as much understanding as possible from that information. Machine Learning is closely related to statistics and modeling and has a wide range of applications, including natural language processing, searching, robotics, indexing, and other forms of pattern recognition. This unit begins by defining Machine Learning, its applications, and a number of other important terms used throughout the unit. We will then go over the three main classes of Machine Learning: Supervised Learning, Semi-Supervised Learning, and Unsupervised Learning. By the end of the unit, you will have an introductory foundation in Machine Learning that will be useful for further academic study in the field.
Completing this unit should take you approximately 26 hours.
6.1: Definition of Learning and Machine Learning
Read these slides. Machine learning is learning using methods that can be implemented in software.
Read these slides.
Read Chapters 6 and 7. Recognition of patterns occurs in search, game playing, language recognition, expert systems and rule-based systems, vision, and learning.
Read these slides.
Read these slides, which explain how to handle very large numbers of features through feature selection and ranking, and which cover applications to clustering.
6.2: Types of Machine Learning
Read these slides, which discuss training for artificial neural nets.
Read the following articles on types of machine learning.
- Supervised Learning
- Unsupervised Learning
- Self-organizing Map
- Adaptive Resonance Theory
- Semi-supervised Learning
- Co-training
- Maximum Likelihood
- Expectation Maximization
There are many learning methods, each with strengths and weaknesses for particular applications, data sets, and situations. Issues that must be addressed include bias (the prediction of a learning algorithm for a given input is systematically incorrect, even when averaged over models trained on several different data sets), variance (how much the prediction for a given input changes when the algorithm is trained on different data sets), the complexity of the function to be predicted, the complexity of the data, noisy data, and missing data.
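The bias and variance described above can be estimated by simulation. The following is a minimal sketch, not part of the assigned readings: it assumes a made-up target function (x squared), Gaussian noise, and two illustrative learners (a constant predictor and a 1-nearest-neighbor predictor). All function names are hypothetical and chosen only for this example.

```python
import random
import statistics

def true_function(x):
    # Assumed target for illustration only.
    return x * x

def sample_training_set(rng, n=20, noise=0.1):
    # Draw n noisy observations of the target on [-1, 1].
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [true_function(x) + rng.gauss(0.0, noise) for x in xs]
    return xs, ys

def fit_constant(xs, ys):
    # Very simple learner: always predict the mean training label.
    mean_y = statistics.mean(ys)
    return lambda x: mean_y

def fit_nearest_neighbor(xs, ys):
    # Very flexible learner: predict the label of the closest training point.
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def bias_and_variance(fit, x0, trials=500, seed=0):
    # Train on many independent data sets and look at the predictions at x0.
    rng = random.Random(seed)
    preds = [fit(*sample_training_set(rng))(x0) for _ in range(trials)]
    bias = statistics.mean(preds) - true_function(x0)  # systematic error
    variance = statistics.pvariance(preds)             # spread across data sets
    return bias, variance

constant_bias, constant_variance = bias_and_variance(fit_constant, 0.9)
nn_bias, nn_variance = bias_and_variance(fit_nearest_neighbor, 0.9)
print("constant learner:  bias", constant_bias, "variance", constant_variance)
print("nearest neighbor:  bias", nn_bias, "variance", nn_variance)
```

Under these assumptions, the constant learner should show the larger bias and the nearest-neighbor learner the larger variance, which is the usual trade-off between simple and flexible methods.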
Read these slides.
Watch these lectures.
Read this section, which describes joint distributions. Joint distributions from probability theory are useful for studying semi-supervised learning. Two statistical techniques that are also helpful are maximum likelihood and expectation maximization, both of which are used to estimate the parameters of statistical models.
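To make the two estimation techniques concrete, here is a small sketch, assuming one-dimensional data drawn from a mixture of two Gaussians. It is an illustration only and is not taken from the readings; the function names, the initialization, and the fixed iteration count are assumptions. `gaussian_mle` shows maximum likelihood in closed form, and `em_two_gaussians` shows expectation maximization alternating between estimating hidden component memberships and re-estimating parameters.

```python
import math
import random

def gaussian_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def gaussian_mle(xs):
    # Maximum likelihood estimates for a single Gaussian:
    # the sample mean and the variance that divides by n (not n - 1).
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return mean, var

def em_two_gaussians(xs, iterations=50):
    # Crude initialization (an assumption): means at the data extremes.
    means = [min(xs), max(xs)]
    _, overall_var = gaussian_mle(xs)
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: probability that each point came from each component.
        resp = []
        for x in xs:
            p = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: weighted maximum likelihood update of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(xs)
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            variances[k] = max(var_k, 1e-6)  # guard against a collapsing component
    return weights, means, variances

# Synthetic, unlabeled data from two well-separated groups.
rng = random.Random(0)
data = [rng.gauss(-2.0, 1.0) for _ in range(300)] + [rng.gauss(3.0, 1.0) for _ in range(300)]
weights, means, variances = em_two_gaussians(data)
print("mixing weights:", weights)
print("component means:", means)
print("component variances:", variances)
```

In a semi-supervised setting, the same loop could be adapted by fixing the memberships of the labeled points and running the E-step only on the unlabeled ones; that variation is not shown here.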
6.3: A Practical Tool for Machine Learning
Read Chapter 8 on pages 129-136, which describes tool support for machine learning.