# Problem Set 4

## Problem Set 4

This is to be completed by November 16th, 2017.

### Exercises

1. Datacamp
• Complete the lessons:
a. Supervised Learning in R: Regression
b. Supervised Learning in R: Classification
c. Exploratory Data Analysis (If you did not already do so)
2. Let $\lambda\geq 0$, $X\in \Bbb R^n\otimes \Bbb R^m$, $Y\in \Bbb R^n$, and $\beta \in \Bbb R^m$ suitably regarded as matrices.
• Identify when $$\textrm{argmin}_\beta (X\beta-Y)^t(X\beta-Y)+\lambda \beta^t\beta$$ exists, and determine it in these cases.
• How does the size of $\lambda$ affect the solution? When might it be desirable to set $\lambda$ to be positive?
3. Bayesian approach to linear regression. Suppose that $\beta\sim N(0,\tau^2)$, and the distribution of $Y$ conditional on $X$ is $N(X\beta,\sigma^2I)$, i.e., $\beta$, $X$, and $Y$ are vector valued random variables. Show that, after seeing some data $D$, the MAP and mean estimates of the posterior distribution for $\beta$ correspond to solutions of the previous problem.

4. R Lab:

• Write a linear regression function that takes in a matrix of $x$-values and a corresponding vector of $y$-values and returns a function derived from the linear regression fit.
• Write a function that takes in a non-negative number (the degree), a vector of $x$-values and a corresponding vector of $y$-values and returns a function derived from the polynomial regression fit.
• Write a function that takes in a number $n$, a vector of $x$-values, and a corresponding vector of $y$-values and returns a function of the form: $$f(x)=\sum_{i=0}^n a_i \sin(ix)+b_i\cos(ix).$$
• Generate suitable testing data for the three functions constructed above and plot the fitted functions.