{ "cells": [ { "cell_type": "markdown", "id": "df2af555-02f9-445b-99e0-ef2c59880cb3", "metadata": {}, "source": [ "# ML2: From Univariate Linear Regression to Neural Networks" ] }, { "cell_type": "code", "execution_count": 265, "id": "c775062c-7951-4ae3-b181-2ae3729e2c86", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import seaborn as sns\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "id": "7c746d65-35b1-490e-914d-850797c405b5", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "#### Goals\n", "* Know how to generalize a linear regression or linear classification model to:\n", "    * predict multiple labels\n", "    * take multiple features\n", "* Know the meaning and purpose of an **activation function**\n", "    * Know about a few possible activation functions (tanh, ReLU, and some ReLU variants)\n", "* Know how to build a **neural network** by stacking linear layers together with activation functions in between" ] }, { "cell_type": "markdown", "id": "2aa94637-546d-420f-98a6-c0c247c2dd2f", "metadata": {}, "source": [ "## Outline\n", "\n", "* Generalizing linear regression:\n", "    * What if I want to predict multiple labels?\n", "    * What if I have multiple input features?\n", "    * $\\hat{y} = Wx + b$\n", "    * the bias trick, briefly\n", "* Multiclass classification using softmax:\n", "$$\n", "\\sigma(\\vec{z})_i = \\frac{e^{z_i}}{\\sum_j e^{z_j}}\n", "$$\n", "    * Converts an arbitrary vector $\\vec{z}$ of un-normalized \"scores\" into a vector of probabilities (i.e., elements are $\\ge 0$ and sum to 1).\n", "* Nonlinear data: need nonlinear functions\n", "    * 
[playground](https://playground.tensorflow.org/#activation=sigmoid&batchSize=10&dataset=xor&regDataset=reg-plane&learningRate=0.03&regularizationRate=0&noise=0&networkShape=&seed=0.32816&showTestData=false&discretize=false&percTrainData=50&x=true&y=true&xTimesY=false&xSquared=false&ySquared=false&cosX=false&sinX=false&cosY=false&sinY=false&collectStats=false&problem=classification&initZero=false&hideText=false&showTestData_hide=true&activation_hide=false&regularization_hide=true&batchSize_hide=false&regularizationRate_hide=true&percTrainData_hide=true&problem_hide=false)\n", "\n", "* Idea: stack multiple linear functions in a row: $\\hat{y} = W_2(W_1 x)$\n", "* Problem: $W_2(W_1 x) = (W_2 W_1)x = W'x$, so the stacked model is still linear in $x$\n", "* Idea: put something nonlinear in between.\n", "    * Activation functions: [zoo](https://ut.philkr.net/deeplearning/pdf/deep_networks/activation_functions.pdf)\n", "    * Common choice: ReLU, $\\mathrm{ReLU}(z) = \\max(z, 0)$\n", "* Terminology: hidden layer, neuron" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.12" } }, "nbformat": 4, "nbformat_minor": 5 }