ML2: From Univariate Linear Regression to Neural Networks

In [265]:
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

Goals

  • Know how to generalize a linear regression or linear classification model to:
    • predict multiple labels
    • take multiple features
  • Know the meaning and purpose of an activation function
    • Know about a few possible activation functions (tanh, ReLU, some ReLU variants)
  • Know how to build a neural network by stacking linear layers together with activation functions in between

Outline

  • Generalizing linear regression:

    • What if I want to predict multiple labels?
    • What if I have multiple input features?
    • $y = Wx + b$
    • the bias trick, briefly (see the first NumPy sketch after this outline)
  • Multiclass classification using softmax: $$ \sigma(\vec{z})_i = \frac{e^{z_i}}{\sum_j e^{z_j}} $$

    • Converts an arbitrary vector of un-normalized "scores" into a vector of probabilities (i.e., elements are $\ge 0$ and sum to 1); see the softmax sketch after this outline.
  • Nonlinear data: need nonlinear functions

    • playground (e.g., the in-browser TensorFlow Playground demo)
  • Idea: stack multiple linear functions in a row $\hat{y} = W_2(W_1x)$

  • Problem: $W_2W_1 = W'$; the composition is still linear (checked numerically in a sketch after this outline)

  • Idea: put something nonlinear in between.

    • Activation functions: there is a whole zoo of options (tanh, ReLU, ReLU variants, ...)
      • Common choice - ReLU: $\mathrm{ReLU}(z) = \max(z, 0)$; used in the network sketch after this outline
  • Terminology: hidden layer, neuron
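To make the outline concrete, here is a minimal NumPy sketch of the multi-feature, multi-label linear model $y = Wx + b$ and the bias trick. The shapes (3 input features, 2 output labels) and the random weights are placeholder assumptions for illustration, not values from the lecture.

In [ ]:
rng = np.random.default_rng(0)
W = rng.normal(size=(2, 3))   # one row of weights per output label
b = rng.normal(size=2)        # one bias per output label
x = rng.normal(size=3)        # a single example with 3 features

y = W @ x + b                 # shape (2,): one prediction per label

# The bias trick: append a constant 1 to x and fold b into W as an
# extra column, so the whole model becomes a single matrix multiply.
W_aug = np.hstack([W, b[:, None]])   # shape (2, 4)
x_aug = np.append(x, 1.0)            # shape (4,)
assert np.allclose(W_aug @ x_aug, y)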
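A small softmax implementation matching the formula above; subtracting the max is a standard numerical-stability trick, not part of the definition.

In [ ]:
def softmax(z):
    """Map a vector of un-normalized scores to probabilities."""
    # Subtracting max(z) cancels in the ratio, so the result is
    # unchanged, but it prevents overflow in exp() for large scores.
    e = np.exp(z - np.max(z))
    return e / e.sum()

scores = np.array([2.0, 1.0, -1.0])
p = softmax(scores)
print(p, p.sum())   # entries are >= 0 and sum to 1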
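A quick numerical check that stacking two linear layers collapses to one: $W_2(W_1x) = (W_2W_1)x = W'x$. The layer sizes (3 -> 4 -> 2) are arbitrary placeholders.

In [ ]:
rng = np.random.default_rng(1)
W1 = rng.normal(size=(4, 3))   # first linear "layer": 3 features -> 4 units
W2 = rng.normal(size=(2, 4))   # second linear "layer": 4 units -> 2 outputs
x = rng.normal(size=3)

W_prime = W2 @ W1              # collapses to a single (2, 3) matrix
assert np.allclose(W2 @ (W1 @ x), W_prime @ x)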
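Finally, a sketch of a one-hidden-layer network: the same two linear layers, but with ReLU applied in between, so the composition no longer collapses to a single matrix. The weights are random placeholders (untrained), and the 4-neuron hidden layer is an arbitrary choice.

In [ ]:
def relu(z):
    return np.maximum(z, 0)

rng = np.random.default_rng(2)
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)   # hidden layer: 4 neurons
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)   # output layer: 2 labels

x = rng.normal(size=3)
h = relu(W1 @ x + b1)    # hidden activations: the nonlinearity between layers
y_hat = W2 @ h + b2      # no longer expressible as a single Wx + b
print(y_hat)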