A big computer, a complex algorithm, and a long time does not equal science. – Robert Gentleman
Examples
Before getting into what supervised learning precisely is, let’s look at some examples of supervised learning tasks:
- Identifying breast cancer.
  - A sample study.
- Image classification.
  - A list of last year's ILSVRC winners.
- Threat assessment.
- Language translation.
- Identifying faces in images.
Definition
Supervised learning is concerned with the construction of machine learning algorithms $f\colon 2^{D\times T}\times D\to T$, where $D$ is the space of inputs and $T$ is the space of targets. Subsets of $D\times T$ are referred to as sets of labeled examples.
We often assume that the labeled examples arise from restricting some presumed function $F\colon D\to T$ whose values we know on $E\subset D$. We can then train $f$ on the pairs $\{(s, F(s)) \mid s\in E\}$ and devise a cost function which measures the distance between the learned $f$ and the presumed $F$ (e.g., the $L^2$-distance).
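To make the signature concrete, here is a minimal sketch (an illustration of my own in Python with NumPy; the function names are not from the text) of a learner with the shape $f\colon 2^{D\times T}\times D\to T$, namely a 1-nearest-neighbor rule, together with an empirical $L^2$-style cost that compares the learned $f$ to the presumed $F$ on held-out pairs $(s, F(s))$.

```python
import numpy as np

def nearest_neighbor(labeled_examples, x):
    """A learner f: 2^(D x T) x D -> T.

    labeled_examples: a list of pairs (s, F(s)) with s a point of R^n.
    x: a query point in D.
    Returns the label of the closest training point (the 1-nearest-neighbor rule).
    """
    points = np.array([s for s, _ in labeled_examples], dtype=float)
    labels = [t for _, t in labeled_examples]
    distances = np.linalg.norm(points - np.asarray(x, dtype=float), axis=1)
    return labels[int(np.argmin(distances))]

def empirical_l2_cost(f, labeled_examples, held_out_pairs):
    """Estimate the L^2 distance between the learned f and the presumed F
    on held-out pairs (s, F(s)); assumes T is numeric."""
    errors = [(f(labeled_examples, s) - t) ** 2 for s, t in held_out_pairs]
    return float(np.sqrt(np.mean(errors)))
```

For instance, with $D=\Bbb R^2$ and $T=\Bbb R$, `nearest_neighbor([((0, 0), 1.0), ((1, 1), 2.0)], (0.2, 0.1))` returns `1.0`.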
Supervised learning has been the most successful of the three branches of machine learning when applied to real-world problems. The existence of labeled examples usually leads to well-defined performance metrics, which transform a supervised learning task into two other tasks: finding an appropriate parametrized class of functions from which to choose $f$, and the optimization problem of finding the best function in that class.
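As a rough illustration of that decomposition (a sketch under my own assumptions, using NumPy least squares; nothing here is prescribed by the text): take the parametrized class to be the linear functions $f_\theta(x)=\theta^\top x$, and let the optimization step select the $\theta$ that minimizes the squared error on the labeled examples.

```python
import numpy as np

# Step 1: a parametrized class of functions -- here, linear maps f_theta(x) = theta . x.
def f_theta(theta, x):
    return float(np.dot(theta, x))

# Step 2: an optimization problem -- find the theta in that class minimizing the
# squared error over the labeled examples (s_i, t_i), here via least squares.
def fit_linear(S, t):
    theta, *_ = np.linalg.lstsq(np.asarray(S, dtype=float),
                                np.asarray(t, dtype=float), rcond=None)
    return theta

# Toy usage: noiseless examples generated with theta = (2, -1) are recovered.
rng = np.random.default_rng(0)
S = rng.normal(size=(50, 2))
t = S @ np.array([2.0, -1.0])
theta_hat = fit_linear(S, t)   # approximately [2., -1.]
```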
Typically $D$ will be some subset of $\Bbb R^n$, and we will refer to the components of $\Bbb R^n$ as features, independent variables, or attributes (many concepts in machine learning have many names). Sometimes $D$ will be discrete, in which case we refer to its components as categorical variables. There are a few simple tricks to map categorical variables into $\Bbb R^n$ (such as one-hot encoding, sketched below), so it usually does not hurt to think of $D$ as a subset of $\Bbb R^n$.
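Here is a minimal sketch of one-hot encoding in plain Python and NumPy (in practice a library routine such as scikit-learn's `OneHotEncoder` or pandas' `get_dummies` would typically be used instead).

```python
import numpy as np

def one_hot(values):
    """Map a categorical variable into R^k by one-hot encoding:
    each of the k distinct categories becomes a standard basis vector."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    encoded = np.zeros((len(values), len(categories)))
    for row, v in enumerate(values):
        encoded[row, index[v]] = 1.0
    return encoded, categories

# one_hot(["red", "green", "red"]) gives categories ["green", "red"] and the
# rows [[0, 1], [1, 0], [0, 1]], i.e. points of R^2.
```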
When $T$ is discrete (typically finite), the supervised learning problem is called a classification problem. When $T$ is continuous (e.g., $\Bbb R$), it is called a regression problem.
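As a small concrete contrast (toy data of my own, not from the text), the labeled-example format is the same in both cases; only the target space $T$ changes.

```python
# Classification: T is a finite set of labels.
spam_examples = [([0.1, 3.0], "spam"), ([0.0, 0.2], "not spam")]

# Regression: T is continuous (here, a sale price in dollars).
price_examples = [([1200.0, 3.0], 250_000.0), ([800.0, 2.0], 180_000.0)]
```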
Examples
Let us consider the following sequence of supervised learning methods in turn.
- Linear Regression (Regression problems).
- $k$-nearest Neighbors (Classification or Regression problems).
- Logistic Regression (Classification problems¹).
- Naive Bayes Classifier (Classification problems).
- Linear Discriminant Analysis (Classification problems).
- Support Vector Machines (Classification problems).
- Decision trees (Primarily Classification problems).
- Neural networks (Classification or Regression problems).
¹ I know the name is confusing.