Problem Set 4

This is to be completed by November 16th, 2017.

Exercises

  1. Datacamp
    • Complete the lessons:
      a. Supervised Learning in R: Regression
      b. Supervised Learning in R: Classification
      c. Exploratory Data Analysis (if you have not already done so)
  2. Let $\lambda\geq 0$, $X\in \Bbb R^n\otimes \Bbb R^m$, $Y\in \Bbb R^n$, and $\beta \in \Bbb R^m$ suitably regarded as matrices.
    • Identify when $$\textrm{argmin}_\beta (X\beta-Y)^t(X\beta-Y)+\lambda \beta^t\beta$$ exists, and determine it in these cases.
    • How does the size of $\lambda$ affect the solution? When might it be desirable to set $\lambda$ to be positive?
  3. Bayesian approach to linear regression. Suppose that $\beta\sim N(0,\tau^2 I)$ and that the distribution of $Y$ conditional on $X$ and $\beta$ is $N(X\beta,\sigma^2I)$, i.e., $\beta$, $X$, and $Y$ are vector-valued random variables. Show that, after seeing some data $D$, the MAP and mean estimates of the posterior distribution for $\beta$ correspond to solutions of the previous problem.

  4. R Lab:

    • Write a linear regression function that takes in a matrix of $x$-values and a corresponding vector of $y$-values and returns a function derived from the linear regression fit.
    • Write a function that takes in a non-negative integer (the degree), a vector of $x$-values, and a corresponding vector of $y$-values and returns a function derived from the polynomial regression fit.
    • Write a function that takes in a non-negative integer $n$, a vector of $x$-values, and a corresponding vector of $y$-values and returns a function of the form: $$f(x)=\sum_{i=0}^n \left(a_i \sin(ix)+b_i\cos(ix)\right).$$
    • Generate suitable testing data for the three functions constructed above and plot the fitted functions.
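
As a possible starting point for the first part of the R Lab, here is a minimal sketch of a function that fits a linear regression and returns the fitted function as a closure. The name `fit_linear`, the use of `qr.solve` for the least-squares step, the added intercept column, and the synthetic data are all assumptions made for illustration; they are not prescribed by the exercise.

```r
# Hypothetical helper (name and design are assumptions, not part of the assignment).
# Takes a matrix (or vector) of x-values and a vector of y-values, and returns
# a function that evaluates the least-squares fit at new x-values.
fit_linear <- function(X, y) {
  X <- as.matrix(X)
  design <- cbind(1, X)            # prepend an intercept column (an assumption)
  coefs <- qr.solve(design, y)     # least-squares coefficients via QR
  function(newX) {
    newX <- matrix(newX, ncol = ncol(X))
    drop(cbind(1, newX) %*% coefs) # predictions as a plain numeric vector
  }
}

# Synthetic test data (assumed for illustration): y = 2 + 3x + noise.
set.seed(1)
x <- seq(0, 1, length.out = 50)
y <- 2 + 3 * x + rnorm(50, sd = 0.1)

f <- fit_linear(x, y)

# Plot the data and overlay the fitted function.
plot(x, y)
lines(x, f(x))
```

Returning a closure keeps the fitted coefficients private to the returned function. The polynomial and trigonometric parts could plausibly reuse the same helper by first building a design matrix from the relevant basis functions (powers of $x$, or $\sin(ix)$ and $\cos(ix)$), though other designs are equally reasonable.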
