Skip to main content

Supervised Learning

Learn supervised learning algorithms including linear models, decision trees, neural networks, and more. Understand how models learn from labeled data to make predictions.

What is Supervised Learning?

Supervised learning is a type of machine learning where algorithms learn from labeled training data to make predictions or decisions. The "supervised" aspect comes from the fact that we provide the correct answers (labels) during training, allowing the model to learn the relationship between inputs and outputs.

Formal Definitions

Hypothesis, Model, and Prediction Function

A hypothesis or model or prediction function is a function h:XYh : \mathcal{X} \rightarrow \mathcal{Y} that maps from the input space X\mathcal{X} to the output space Y\mathcal{Y}.

Training Set

A training set is a set of pairs {(x(1),y(1)),...,(x(n),y(n))}\{(x^{(1)}, y^{(1)}), ..., (x^{(n)}, y^{(n)})\} such that x(i)Xx^{(i)} \in \mathcal{X} and y(i)Yy^{(i)} \in \mathcal{Y} for i{1,...,n}i \in \{1, ..., n\}.

The value nn is the training set size.

Goal of Learning

Goal: Use the training set to find (= learn) a good model hh.

  • What "good" means is not always easy to define (part of the modeling challenge).
  • We will want to use the model hh on new data, not the training set (generalization).

Problem Types

If Y\mathcal{Y} is continuous, then we call it a regression problem.

If Y\mathcal{Y} is discrete, then we call it a classification problem (binary or multi-class).

The Learning Pipeline

Topics Covered