Problem Set 5

This is to be completed by November 23rd, 2017.

Exercises

1. Datacamp
• Complete the lesson:
a. Machine Learning Toolbox
2. R Lab:
• Write a function in R that takes a vector of values of a discrete variable and produces the corresponding one-hot encodings.
• Write a function in R that takes a matrix $X$ of samples and a vector $Y$ of classes (in $\{1,\dots,K\}$) and returns a function that classifies a new sample according to the LDA rule (do not use R’s built-in machine learning facilities).
• Do the same for QDA.
• Apply your models to the MNIST dataset for handwriting classification. There are various ways to get this dataset, but perhaps the easiest is to pull it in through the keras package. Besides, having keras is useful anyway. You may need to reduce the dimension of the data and/or the number of samples to get this to work in a re