{ "cells": [ { "cell_type": "markdown", "id": "c3109b7e-0092-47a3-aca2-288e745731ba", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "## Lecture 9 - Image Alignment 2; Robust Model Fitting (RANSAC)" ] }, { "cell_type": "markdown", "id": "c0357b50-2962-423e-85f1-80217ec3d134", "metadata": {}, "source": [ "#### Announcements\n", "\n", "* Project 1:\n", " * Github classroom is trying to run the tests for you when you push, but they haven't seen the `uv` light yet. If the tests pass locally, you're in good shape! Sorry for any confusion.\n", " * Code due tonight\n", " * artifacts due tomorrow night\n", "* Project 2 is out! Due next Tuesday\n", " * Be prepared for it to be a little bigger than project 1 (mainly just more pieces)\n", "* Soliciting ideas and/or image submissions for our class sticker! I need to send something in by the end of this week.\n", "* Today's tea: Four Seasons Si Ji Chun oolong" ] }, { "cell_type": "markdown", "id": "38efd636-c09e-4d1f-a828-33e0fcc75686", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "#### Goals\n", "* Know how to find a least-squares best-fit transformation for:\n", " * homography (with caveats)\n", "* Understand the Random Sample Consensus (RANSAC) algorithm" ] }, { "cell_type": "code", "execution_count": 5, "id": "4a295613-e6e3-4f35-b603-27b158385acc", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The autoreload extension is already loaded. To reload it, use:\n", " %reload_ext autoreload\n" ] } ], "source": [ "# boilerplate setup\n", "%load_ext autoreload\n", "%autoreload 2\n", "\n", "%matplotlib inline\n", "\n", "import os\n", "import sys\n", "\n", "src_path = os.path.abspath(\"../src\")\n", "if (src_path not in sys.path):\n", " sys.path.insert(0, src_path)\n", "\n", "# Library imports\n", "import numpy as np\n", "import imageio.v3 as imageio\n", "import matplotlib.pyplot as plt\n", "import skimage as skim\n", "import cv2\n", "\n", "# codebase imports\n", "import util\n", "import filtering\n", "import features\n", "import geometry" ] }, { "cell_type": "markdown", "id": "c6ed4ad7-132d-4906-8a67-fae476b90b39", "metadata": {}, "source": [ "#### Plan\n", "\n", "* Homography fitting\n", "* Outlier robustness: RANSAC\n", "* *Warping: forward and inverse*\n", "* *Bilinear interpolation*" ] }, { "cell_type": "markdown", "id": "4d795b83-42fc-467b-a5a1-c9125c68740b", "metadata": {}, "source": [ "### Context: Panorama Stitching Overview\n", "\n", "* [x] Detect features - Harris corners\n", "* [x] Describe features - MOPS descriptor\n", "* [x] Match features - SSD + ratio test\n", "* Estimate motion model from correspondences\n", " * [x] Translation\n", " * [x] Affine\n", " * [ ] **Projective**\n", " * [ ] **Robustness to outliers - RANSAC**\n", "* Warp image(s) into common coordinate system and blend\n", " * [ ] **Inverse warping**\n", " * [ ] Blending\n", " * [ ] 360 panoramas?" ] }, { "cell_type": "markdown", "id": "f511d1bf-422f-4ccd-acb1-120f9a67baf9", "metadata": {}, "source": [ "Recall our definition of the optimal transformation for a given set of correspondences is the one that **minimizes** the sum of squared residuals:\n", "$$\n", "\\min_T \\sum_i||(T\\mathbf{p}_i - \\mathbf{p}_i')||^2\n", "$$\n" ] }, { "cell_type": "markdown", "id": "d5fad59d-ddd8-4ce8-a9fc-34bf6262d9d2", "metadata": {}, "source": [ "##### Homework Problem 1\n", "\n", "Write down the $x$ and $y$ residuals for a pair of corresponding points $(x, y)$ in image 1 and $(x', y')$ in image 2 under a homography (projective) motion model. Assume the homography matrix is parameterized as\n", "$$\n", "\\begin{bmatrix}\n", " a & b & c\\\\\n", " d & e & f\\\\\n", " g & h & 1\n", "\\end{bmatrix}\n", "$$" ] }, { "cell_type": "markdown", "id": "0d87eace-1d55-4fd1-b48a-e22ae52f5a01", "metadata": {}, "source": [ "Whiteboard: \n", "* homography residuals\n", "* roadblocks\n", "* duct-tape fixes" ] }, { "cell_type": "markdown", "id": "d6becae3-caa9-4bff-805b-818d22d719d8", "metadata": {}, "source": [ "Whiteboard: solving homogeneous least squares systems:\n", "$$\n", "\\min_\\mathbf{x} ||A\\mathbf{x}||\n", "$$\n", "subject to\n", "$$\n", " ||x|| = 1 \n", " $$" ] }, { "cell_type": "markdown", "id": "5b9b159a-0158-4332-8906-23bbc6c24284", "metadata": {}, "source": [ "TL;DM (too long; didn't math):\n", "\n", "* Decompose $A$ using the SVD:\n", " $$\n", " U_{m \\times m}, \\Sigma_{m\\times n}, V^T_{n \\times n} = \\mathrm{SVD}(A_{m \\times n})\n", " $$\n", "* The optimal vector $x^* = $ the row of $V^T$ (column of $V$) corresponding to the smallest element of $\\Sigma$ (which is diagonal)\n", "* Usually your linear algebra library will order things so that $\\Sigma$'s elements are in descending order, so in practice $x^* = $ the last row of $V^T$ is the optimal $x*$" ] }, { "cell_type": "markdown", "id": "864ca289-1e19-438d-9441-383952e1f336", "metadata": {}, "source": [ "### Next up: Robustness to outliers\n", "\n", "![](../data/outliers.png)" ] }, { "cell_type": "markdown", "id": "991f0bde-34f6-4774-a3aa-4a2eedc0ac09", "metadata": {}, "source": [ "#### RANSAC: RAndom SAmple Consensus\n", "\n", "Finding a transformation is a model fitting problem. A simple model fitting problem that we'll use as analogy is **line fitting** (in fact, this is what linear least squares is doing for us, it's just fitting higher-dimensional lines).\n", "\n", "**Problem statement**, for now: Given a set of points with some outliers, find the line that fits the non-outliers best.\n", "\n", "**Key Idea:** \n", "\n", "> “All good matches are alike; every bad match is bad in its own way.”\n", "> \n", "> -Tolstoy, as misquoted by Alyosha Efros" ] }, { "cell_type": "markdown", "id": "f3137007-510a-4c82-a23a-cc65d1afecc0", "metadata": {}, "source": [ "**Observation**: If I have a candidate model, I can tell how good it is by measuring how many points \"agree\" on that model." ] }, { "cell_type": "markdown", "id": "05ee953c-b592-46f9-b564-6eef374b5152", "metadata": {}, "source": [ "Algorithm, take 1:\n", "```\n", "for every possible line:\n", " count how many points are inliers to that line\n", "return the line with the most inliers\n", "```\n", "Runtime: O($\\infty$)\n" ] }, { "cell_type": "markdown", "id": "47642649-cf61-47b4-888a-b9e48b31520b", "metadata": {}, "source": [ "Algorithm, take 2:\n", "```\n", "for every line that goes through two of the given points:\n", " count how many points are inliers to that line\n", "return the line with the most inliers\n", "```\n", "Runtime: O(n^3)" ] }, { "cell_type": "markdown", "id": "7a6f32f6-d64d-4d86-b61e-4cea04a73646", "metadata": {}, "source": [ "Algorithm, take 3: RANSAC - see whiteboard notes" ] }, { "cell_type": "markdown", "id": "40b1aa7e-86cf-4a1b-a359-82261cc270e8", "metadata": {}, "source": [ "##### Homework Problems 2-4\n", "\n", "2. In the inner loop of RANSAC, how many points are used to fit a candidate model if you are fitting a line to a set of 2D points?\n", "3. In the inner loop of RANSAC, how many pairs of corresponding points are used to fit a candidate model if you are fitting a translation to a set of correspondences?\n", "4. In the inner loop of RANSAC, how many pairs of corresponding points are used to fit a candidate model if you are fitting a homography to a set of correspondences?" ] }, { "cell_type": "markdown", "id": "ae07d56a-2f0b-413c-be6c-e657abb230e1", "metadata": {}, "source": [ "##### Homework Problem 5 (preview)\n", "\n", "In this problem, we'll analyze the RANSAC algorithm to help us understand how to decide how many iterations to run ($K$). Suppose we are fitting some model that requires a minimal set of $s$ points to fully determine (e.g., $s=4$ matches for a homography, $s=2$ points for a line). We also know (or have assumed) that the data has an *inlier ratio* of $r = \\frac{\\text{\\# inliers}}{\\text{\\# data points}}$; in other words, the probability that a randomly sampled point from the dataset as a probability of $r$ of being an inlier." ] }, { "cell_type": "markdown", "id": "283352cd-8321-4ac3-a5c3-95eb4ce6cdce", "metadata": {}, "source": [ "---\n", "*(the following is included here just in case there's time)*" ] }, { "cell_type": "markdown", "id": "71425a03-00aa-43ed-a1ab-3260ae88ba1e", "metadata": {}, "source": [ "### Warping: Forward and Inverse\n", "\n", "(See whiteboard notes.)\n", "\n", "Forward warping:\n", "```\n", "for x, y in src:\n", " x', y' = T(x, y)\n", " dst[x', y'] = src[x, y]\n", "```\n", "\n", "Inverse warping:\n", "```\n", "Tinv = inv(T)\n", "for (x', y') in dst:\n", " x, y = Tinv(x', y')\n", " dst[x', y'] = src[x, y]\n", "```" ] }, { "cell_type": "markdown", "id": "f58df44e-6bca-4ab2-984f-ff20d2828d79", "metadata": {}, "source": [ "##### Homework Problem 6\n", "\n", "5. Complete the following function with Python-esque pseudocode (or working code in the lecture codebase), which performs inverse warping with nearest-neighbor sampling in the source image.\n", "\n", "```python\n", "def warp(img, tx, dsize=None)\n", " \"\"\" Warp img using tx, a matrix representing a geometric transformation.\n", " Pre: tx is 3x3 (or some upper-left slice of a 3x3 matrix). img is grayscale.\n", " Returns: an output image of shape dsize with the warped img\"\"\"\n", " H, W = img.shape[:2]\n", "\n", " # turn a 2x2 or 2x3 tx into a full 3x3 matrix\n", " txH, txW = tx.shape\n", " M = np.eye(3)\n", " M[:txH,:txW] = tx\n", "\n", " Minv = np.linalg.inv(M)\n", " \n", " if dsize is None:\n", " DH, DW = (H, W)\n", " else:\n", " DH, DW = dsize[::-1]\n", " out = np.zeros((DH, DW))\n", "\n", " # your code here\n", " \n", " return out\n", "\n", "```\n" ] }, { "cell_type": "code", "execution_count": null, "id": "7370db80-e40a-41df-b278-a4f8bd125a0e", "metadata": {}, "outputs": [], "source": [ "y1 = imageio.imread(\"../data/yos1.jpg\").astype(np.float32) / 255\n", "y1 = skim.color.rgb2gray(y1)" ] }, { "cell_type": "code", "execution_count": null, "id": "91945844-fa92-4764-8e4e-1152c862c4f5", "metadata": {}, "outputs": [], "source": [ "h, w = y1.shape\n", "\n", "tx = np.eye(3)\n", "tx[:2,2] = [10, 20]\n", "tx[0,1] = 0.1\n", "util.imshow_gray(geometry.warp(y1, tx))" ] }, { "cell_type": "code", "execution_count": null, "id": "eaaa0b3a-df25-411d-b51f-9024c3562797", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.12" } }, "nbformat": 4, "nbformat_minor": 5 }