ML2: From Univariate Linear Regression to Neural Networks¶
In [265]:
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
Goals¶
- Know how to generalize a linear regression or linear classification model to:
- predict multiple labels
- take multiple features
- Know the meaning and purpose of an activation function
- Know about a few possible activation functions (tanh, ReLU, some ReLU variants)
- Know how to build a neural network by stacking linear layers together with activation functions in between
Outline¶
Generalizing linear regression:
- What if I want to predict multiple labels?
- What if I have multiple input features?
- $y = Wx + b$
- the bias trick, briefly
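A minimal numpy sketch (illustrative shapes and variable names, not from the lecture) of the multi-output model $y = Wx + b$ and the bias trick, using the np import above:

rng = np.random.default_rng(0)

n_features, n_outputs = 3, 2
x = rng.normal(size=n_features)           # one example with 3 features
W = rng.normal(size=(n_outputs, n_features))
b = rng.normal(size=n_outputs)

y_hat = W @ x + b                         # y = Wx + b, now a length-2 prediction

# Bias trick: append a constant 1 to x and fold b in as an extra column of W,
# so the whole model becomes a single matrix multiply.
x_aug = np.append(x, 1.0)                 # shape (4,)
W_aug = np.hstack([W, b[:, None]])        # shape (2, 4)
assert np.allclose(W_aug @ x_aug, y_hat)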
Multiclass classification using softmax: $$ \sigma(\vec{z})_i = \frac{e^{z_i}}{\sum_j e^{z_j}} $$
- Converts an arbitrary vector of un-normalized "scores" into a vector of probabilities (i.e., elements are $\ge 0$ and sum to 1).
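A possible numpy implementation (the shift by $\max(z)$ is a standard numerical-stability trick and does not change the output):

def softmax(z):
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())               # subtract the max for numerical stability
    return e / e.sum()

scores = np.array([2.0, 1.0, -1.0])       # arbitrary un-normalized scores
probs = softmax(scores)
print(probs, probs.sum())                 # entries are >= 0 and sum to 1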
Nonlinear data: need nonlinear functions
Idea: stack multiple linear functions in a row: $\hat{y} = W_2(W_1x)$
Problem: $W_2W_1 = W'$ is just another matrix, so the composition is still a single linear function (see the quick check below)
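Quick numerical check (illustrative sizes) that two stacked linear maps collapse into one matrix:

rng = np.random.default_rng(1)
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(2, 4))
x = rng.normal(size=3)

W_prime = W2 @ W1                         # a single 2x3 matrix
assert np.allclose(W2 @ (W1 @ x), W_prime @ x)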
Idea: put something nonlinear in between.
- Activation functions: zoo
- Common choice, ReLU: $\mathrm{ReLU}(z) = \max(z, 0)$
Terminology: hidden layer, neuron
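A minimal sketch of a one-hidden-layer network in numpy (layer sizes and names are illustrative, not from the lecture): a linear layer, a ReLU, then another linear layer.

def relu(z):
    return np.maximum(z, 0)

rng = np.random.default_rng(2)
n_in, n_hidden, n_out = 3, 5, 2

W1, b1 = rng.normal(size=(n_hidden, n_in)), np.zeros(n_hidden)
W2, b2 = rng.normal(size=(n_out, n_hidden)), np.zeros(n_out)

x = rng.normal(size=n_in)
h = relu(W1 @ x + b1)                     # hidden layer: each entry is one "neuron"
y_hat = W2 @ h + b2                       # output scores; follow with softmax for classification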