Mathematics for Machine Learning: PCA (MOOC)
MOOC (Coursera)
Overview
Learning material for a MOOC called “Mathematics for Machine Learning: PCA” on Coursera. I’m making this material available because I believe that open-access learning is a good thing. However, if you are interested in getting a certificate, you will need to take the course on Coursera. Principal Component Analysis (PCA) is one of the most important dimensionality reduction algorithms in machine learning. In this course, we lay the mathematical foundations to derive and understand PCA from a geometric point of view.
Pre-requisites
This course is of intermediate difficulty and will require a good understanding of linear algebra as well as good Python and numpy knowledge for the tutorials.
Week 1: Statistics of Datasets
In this week, we learn how to summarize datasets (e.g., images) using basic statistics, such as the mean and the variance. We also look at properties of the mean and the variance when we shift or scale the original dataset. We will provide mathematical intuition as well as the skills to derive the results. We will also implement our results in code (jupyter notebooks), which will allow us to put our mathematical understanding into practice by computing averages of image datasets.
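As a small illustration of these properties, here is a minimal numpy sketch (with made-up data, not part of the course notebooks) of how the mean and variance of a dataset behave when every data point is shifted and scaled:

```python
import numpy as np

# Hypothetical toy dataset: 5 "images", each flattened to 4 pixels.
D = np.array([[0., 1., 2., 3.],
              [1., 3., 5., 7.],
              [2., 4., 6., 8.],
              [3., 5., 7., 9.],
              [4., 6., 8., 10.]])

mean = D.mean(axis=0)   # per-pixel mean of the dataset
var = D.var(axis=0)     # per-pixel (biased) variance

# Apply the same affine transformation x -> a * x + b to every data point.
a, b = 2.0, 5.0
D2 = a * D + b

# The mean follows the transformation: mean(a*x + b) = a * mean(x) + b.
assert np.allclose(D2.mean(axis=0), a * mean + b)
# The variance ignores the shift and scales quadratically: var(a*x + b) = a^2 * var(x).
assert np.allclose(D2.var(axis=0), a**2 * var)
```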
Learning Material
- Videos (playlist on YouTube)
- Tutorial (jupyter notebook): Statistics of datasets
- Additional reading material (pdf, Section 6.4)
Week 2: Inner Products
Data can be interpreted as vectors. Vectors allow us to talk about geometric concepts, such as lengths, distances, and angles, to characterize similarity between vectors. This will become important later in the course when we discuss PCA. In this week, we will introduce and practice the concept of an inner product. Inner products allow us to talk about geometric concepts in vector spaces. More specifically, we will start with the dot product (which we may still know from school) as a special case of an inner product, and then move toward a more general concept of an inner product, which plays an integral part in some areas of machine learning, such as kernel machines (including support vector machines and Gaussian processes).
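To make this concrete, here is a minimal numpy sketch (an illustration, not part of the course material) of a general inner product <x, y> = x^T A y with a symmetric, positive definite matrix A; with A = I it reduces to the familiar dot product, and it induces lengths and angles between vectors:

```python
import numpy as np

def inner(x, y, A=None):
    """Inner product <x, y> = x^T A y; with A = I this is the standard dot product."""
    if A is None:
        A = np.eye(len(x))
    return x @ A @ y

def length(x, A=None):
    """Length of x induced by the inner product."""
    return np.sqrt(inner(x, x, A))

def angle(x, y, A=None):
    """Angle between x and y induced by the inner product."""
    cos = inner(x, y, A) / (length(x, A) * length(y, A))
    return np.arccos(np.clip(cos, -1.0, 1.0))

x = np.array([1.0, 0.0])
y = np.array([1.0, 1.0])

print(angle(x, y))                       # pi/4 under the standard dot product
A = np.array([[2.0, 0.0], [0.0, 1.0]])   # symmetric, positive definite
print(angle(x, y, A))                    # a different angle under <x, y> = x^T A y
```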
Learning Material
- Videos (playlist on YouTube)
- Tutorial (jupyter notebook): Angles and distances between images
- Additional reading material (pdf, Section 3.1-3.7)
Week 3: Projections
In this week, we will look at orthogonal projections of vectors, which live in a high-dimensional vector space, onto lower-dimensional subspaces. This will play an important role in the next module when we derive PCA. We will start off with a geometric motivation of what an orthogonal projection is and work our way through the corresponding derivation. We will end up with a single equation that allows us to project any vector onto a lower-dimensional subspace, and we will also understand how this equation comes about. We will also work through a short programming tutorial in a jupyter notebook.
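As a rough preview, here is a minimal numpy sketch (assuming the standard dot product; not part of the course notebooks) of projecting a vector onto the subspace spanned by the columns of a matrix B, using the projection matrix P = B (B^T B)^{-1} B^T:

```python
import numpy as np

def projection_matrix(B):
    """Projection matrix onto the subspace spanned by the columns of B
    (with respect to the standard dot product): P = B (B^T B)^{-1} B^T."""
    return B @ np.linalg.inv(B.T @ B) @ B.T

# Project a 3D vector onto the 2D subspace spanned by two basis vectors.
B = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
x = np.array([6.0, 0.0, 0.0])

P = projection_matrix(B)
p = P @ x   # orthogonal projection of x onto span(B)

# The residual x - p is orthogonal to the subspace.
assert np.allclose(B.T @ (x - p), 0.0)
```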
Learning Material
- Videos (playlist on YouTube)
- Tutorial (jupyter notebook): Orthogonal projections
- Additional reading material (pdf, Section 3.8)
Week 4: Principal Component Analysis
We can think of dimensionality reduction as a way of compressing data with some loss, similar to JPEG or MP3 compression. Principal Component Analysis (PCA) is one of the most fundamental dimensionality reduction techniques used in machine learning. In this week, we use the results from the first three modules of this course and derive PCA from a geometric point of view. Within this course, this module is the most challenging one: we will go through an explicit derivation of PCA plus some coding exercises that will make us proficient users of PCA.
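As a rough preview of where the module ends up (a minimal numpy sketch, not the course's reference implementation), PCA can be computed from the eigendecomposition of the data covariance matrix, after which data points are projected onto the top principal components and reconstructed in the original space:

```python
import numpy as np

def pca(X, n_components):
    """PCA via the eigendecomposition of the data covariance matrix.
    X has one data point per row."""
    mean = X.mean(axis=0)
    Xc = X - mean                                   # center the data
    C = Xc.T @ Xc / len(X)                          # data covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)            # eigenvalues in ascending order
    idx = np.argsort(eigvals)[::-1][:n_components]  # indices of the top components
    W = eigvecs[:, idx]                             # principal components as columns
    Z = Xc @ W                                      # low-dimensional code
    X_rec = Z @ W.T + mean                          # reconstruction in the original space
    return Z, X_rec, W

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Z, X_rec, W = pca(X, n_components=2)
print(Z.shape, X_rec.shape)   # (100, 2) (100, 5)
```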
Learning Material
- Videos (playlist on YouTube)
- Tutorial (jupyter notebook): Principal component analysis
- Additional reading material (pdf, Chapter 10)
Team
- Marc Deisenroth (Lecturer)
- Yicheng Luo (Teaching Assistant)