{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "df2af555-02f9-445b-99e0-ef2c59880cb3",
   "metadata": {},
   "source": [
    "# ML2: From Univariate Linear Regression to Neural Networks"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 265,
   "id": "c775062c-7951-4ae3-b181-2ae3729e2c86",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import seaborn as sns\n",
    "import matplotlib.pyplot as plt"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7c746d65-35b1-490e-914d-850797c405b5",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "source": [
    "#### Goals\n",
    "* Know how to generalize a linear regression or linear classification model to:\n",
    "    * predict multiple labels\n",
    "    * take multiple features\n",
    "* Know the meaning and purpose of an **activation function**\n",
    "    * Know about a few possible activation functions (tanh, relu, some relu variants) \n",
    "* Know how to build a **neural network** by stacking linear layers together with activation functions in between"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2aa94637-546d-420f-98a6-c0c247c2dd2f",
   "metadata": {},
   "source": [
    "## Outline\n",
    "\n",
    "* Generalizing Linear regression:\n",
    "    * What if I want to predict multiple labels?\n",
    "    * What if I have multiple input features?\n",
    "    * y = Wx + b\n",
    "    * the bias trick, briefly\n",
    "* Multiclass classification using softmax:\n",
    "$$\n",
    "\\sigma(\\vec{x})_i = \\frac{e^{x_i}}{\\sum_j e^{z_j}}\n",
    "$$\n",
    "    * Converts an arbitrary vector of un-normalized \"scores\" into a vector of probabilities (i.e., elements are $\\ge 0$ and sum to 1).\n",
    "* Nonlinear data: need nonlinear functions\n",
    "  * [playground](https://playground.tensorflow.org/#activation=sigmoid&batchSize=10&dataset=xor&regDataset=reg-plane&learningRate=0.03&regularizationRate=0&noise=0&networkShape=&seed=0.32816&showTestData=false&discretize=false&percTrainData=50&x=true&y=true&xTimesY=false&xSquared=false&ySquared=false&cosX=false&sinX=false&cosY=false&sinY=false&collectStats=false&problem=classification&initZero=false&hideText=false&showTestData_hide=true&activation_hide=false&regularization_hide=true&batchSize_hide=false&regularizationRate_hide=true&percTrainData_hide=true&problem_hide=false)\n",
    "\n",
    "* Idea: stack multiple linear functions in a row $\\hat{y} = W_2(W_1x)$\n",
    "* Problem: $W_2W_1 = W'$; this is still linear\n",
    "* Idea: put something nonlinear in between.\n",
    "  * Activation functions: [zoo](https://ut.philkr.net/deeplearning/pdf/deep_networks/activation_functions.pdf)\n",
    "       * Common choice - ReLU: $ReLU(z) = \\max(z, 0)$\n",
    "* Terminology: hidden layer, neuron"
   ]
  }
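  ,
  {
   "cell_type": "markdown",
   "id": "ml2-sketch-linear-softmax-relu",
   "metadata": {},
   "source": [
    "The pieces in the outline (a linear layer $Wx + b$, softmax, and stacking with and without ReLU) can be sketched in NumPy as follows. This is a minimal illustration, not a reference implementation: the helper names `linear`, `softmax`, and `relu`, the layer sizes, and the random weights are all assumptions made for the example.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable\n",
    "\n",
    "def linear(x, W, b):\n",
    "    # Multi-feature, multi-label linear layer: W x + b\n",
    "    return W @ x + b\n",
    "\n",
    "def softmax(scores):\n",
    "    # Subtracting the max first avoids overflow in exp;\n",
    "    # it does not change the resulting probabilities.\n",
    "    exps = np.exp(scores - np.max(scores))\n",
    "    return exps / np.sum(exps)\n",
    "\n",
    "def relu(z):\n",
    "    # ReLU(z) = max(z, 0), applied elementwise\n",
    "    return np.maximum(z, 0)\n",
    "\n",
    "# Hypothetical sizes: 3 input features, 4 hidden neurons, 2 classes\n",
    "x = rng.normal(size=3)\n",
    "W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)\n",
    "W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)\n",
    "\n",
    "# Two stacked linear layers equal one linear layer with W' = W2 W1\n",
    "stacked = linear(linear(x, W1, b1), W2, b2)\n",
    "collapsed = linear(x, W2 @ W1, W2 @ b1 + b2)\n",
    "print(np.allclose(stacked, collapsed))\n",
    "\n",
    "# With ReLU in between, the first layer acts as a hidden layer\n",
    "hidden = relu(linear(x, W1, b1))\n",
    "probs = softmax(linear(hidden, W2, b2))\n",
    "print(probs, probs.sum())\n",
    "```\n",
    "\n",
    "The first `print` should show `True`, because the stacked and collapsed layers compute the same function. The second shows class probabilities that sum to 1. The max-subtraction inside `softmax` is a common numerical-stability trick and is not part of the formula in the outline."
   ]
  }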
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}