Problem Set 11

This is to be completed by January 18th, 2018.

Exercises

Datacamp
- Complete the lesson:
  a. Intermediate Python for Data Science

What is the maximum depth of a decision tree trained on $N$ samples?
If we train a decision tree to an arbitrary depth, what will be the training error?
How can we alter a loss function to help regularize a decision tree?

Python Lab
1. Construct a function which will transform a dataframe of numerical features into a dataframe of binary features of the same shape by setting the value of the jth feature of the ith sample to be true precisely when the value is greater than or equal to the median value of that feature.
2. Construct a function which, when presented with a dataframe of binary features, labeled outputs, and a corresponding loss function, chooses the feature to split upon which will minimize the loss function. Here we assume that on each side of the split the prediction is just the mean value of the outputs.
3. Test these functions on a real-world dataset (for classification) either from ISLR or from Kaggle.
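
As a starting point for the lab, here is a minimal sketch of one way items 1 and 2 could be approached using pandas and NumPy. The function names (binarize_by_median, best_split), the choice of mean squared error as the default loss, and the tiny toy dataframe are all assumptions made for illustration and are not part of the assignment. The sketch also assumes "mean value of the outputs" means the mean over the samples on each side of the split.

```python
import numpy as np
import pandas as pd


def binarize_by_median(features: pd.DataFrame) -> pd.DataFrame:
    """Return a boolean dataframe of the same shape as `features`.

    Entry (i, j) is True exactly when the value is greater than or
    equal to the median of column j.
    """
    return features.ge(features.median(axis=0), axis=1)


def mean_squared_error(y_true, y_pred) -> float:
    """Illustrative loss function (an assumption; any loss(y, y_hat) works)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


def best_split(binary_features: pd.DataFrame, outputs, loss=mean_squared_error):
    """Return (feature_name, loss_value) for the split with the smallest loss.

    On each side of a candidate split, the prediction is the mean of the
    outputs that fall on that side.
    """
    y = np.asarray(outputs, dtype=float)
    best_feature, best_loss = None, np.inf
    for name in binary_features.columns:
        mask = binary_features[name].to_numpy(dtype=bool)
        # Skip features that do not actually separate the samples.
        if mask.all() or not mask.any():
            continue
        predictions = np.where(mask, y[mask].mean(), y[~mask].mean())
        current = loss(y, predictions)
        if current < best_loss:
            best_feature, best_loss = name, current
    return best_feature, best_loss


# Toy data, made up purely to exercise the two functions.
toy_features = pd.DataFrame({"a": [1, 2, 3, 4], "b": [4, 1, 3, 2]})
toy_outputs = [0, 0, 1, 1]

toy_binary = binarize_by_median(toy_features)
chosen_feature, chosen_loss = best_split(toy_binary, toy_outputs)
print(chosen_feature, chosen_loss)
```

In the toy data, feature "a" separates the two output values perfectly once binarized, so it is the one selected. For item 3, the same two calls can be applied to the numerical columns of whichever dataset you choose, with the class labels encoded as 0 and 1.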
