{ "cells": [ { "cell_type": "markdown", "id": "c3109b7e-0092-47a3-aca2-288e745731ba", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "## Lecture 13 - Plane Sweep Stereo" ] }, { "cell_type": "markdown", "id": "c0357b50-2962-423e-85f1-80217ec3d134", "metadata": {}, "source": [ "#### Announcements\n", "* Forgot to mention yesterday: project 1 is graded; scores and feedback are in the Feedback pull request on Github.\n", "* Today's tea: Nepal Imperial Black\n" ] }, { "cell_type": "markdown", "id": "38efd636-c09e-4d1f-a828-33e0fcc75686", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "#### Goals\n", "* Understand and be prepared to implement the plane sweep stereo algorithm.\n", "\n", "#### Projective Geometry (not covering)\n", "* Know how to represent lines in projective space.\n", "* Understand how to determine interactions between points and lines in projective space:\n", " * How to check whether a point lies on a line\n", " * How to calculate the line through two points\n", " * How to calculate the intersection of two lines\n" ] }, { "cell_type": "code", "execution_count": null, "id": "ef30e2c7-ba6f-48fa-a5ae-bc02d9f73679", "metadata": {}, "outputs": [], "source": [ "# boilerplate setup\n", "%load_ext autoreload\n", "%autoreload 2\n", "\n", "%matplotlib inline\n", "\n", "import os\n", "import sys\n", "\n", "src_path = os.path.abspath(\"../src\")\n", "if (src_path not in sys.path):\n", " sys.path.insert(0, src_path)\n", "\n", "# Library imports\n", "import numpy as np\n", "import imageio.v3 as imageio\n", "import matplotlib.pyplot as plt\n", "import skimage as skim\n", "import cv2\n", "\n", "# codebase imports\n", "import util\n", "import filtering\n", "import features\n", "import geometry" ] }, { "cell_type": "markdown", "id": "5e3e7b96-c6b3-4a78-a8c4-c4e13e8b93cc", "metadata": {}, "source": [ "### Plane Sweep Stereo\n", "\n", "Rectified stereo: requires a very particular 
camera setup. \n", "\n", "What if the cameras aren't rectified?\n", "\n", "Two ways around this:\n", "1. **Rectify** images so they appear *as if* the cameras were rectified.\n", "2. **Plane sweep stereo**: Use camera matrices (intrinsics and extrinsics) to reason about depth directly.\n", "\n", "Today we're focusing on plane sweep stereo. The setting:\n", "\n", "* Cameras are *not* rectified, but\n", "* They **are** calibrated, meaning we know:\n", "  * $K_L$, $K_R$, the **intrinsics** of the left and right cameras.\n", "  * $[R|t]_L$, $[R|t]_R$, the **extrinsics** of the left and right cameras. Note that:\n", "    * These are written as $3\\times 4$ matrices, with $R_{3\\times 3}$ augmented with $t_{3\\times 1}$.\n", "    * These are the matrices that go from world to camera; they are the inverses of the cameras' coordinate frame matrices.\n", " \n", "\n", "Our standard stereo algorithm was:\n", "```\n", "for i, j in pixels:\n", "    for d in disparities:\n", "        compute a cost\n", "```\n", "\n", "The plane sweep stereo algorithm swaps the order of the two loops:\n", "```\n", "for d in disparities:\n", "    for i, j in pixels:\n", "        compute a cost\n", "```\n", "\n", "The basic idea is to start with one camera, and:\n", "1. \"Unproject\" each pixel to a hypothesized depth $d$.\n", "2. \"Reproject\" the corresponding 3D points back into the *other* camera.\n", "3. Compute a match score.\n", "\n", "##### HW #1-2\n", "Assume the following are known:\n", "* intrinsics ($K_L, K_R$)\n", "* extrinsics ($[R|t]_L, [R|t]_R$)\n", "\n", "Note that here, we're representing the extrinsics as a $3\\times 4$ augmented matrix $[R|t]$. 
This is the world-to-camera matrix, which maps a homogeneous world point (a 4-vector) to 3D camera coordinates, such that $\\underset{3\\times 1}{\\mathbf{x}^{cam}} = \\underset{3 \\times 4}{[R|t]} \\underset{4 \\times 1}{\\mathbf{x}^{world}}$\n", "\n", "\n", "**#2**: Given a 3D point $(x_w, y_w, z_w)$, write an expression for the pixel coordinates $(x_{img}^R, y_{img}^R)$ of that point in the right camera.\n", "\n", "**#1**: Given a pixel coordinate $(x_{img}^L, y_{img}^L)$ in the left camera, give an expression for the 3D point at a hypothesized depth $d$ that would have projected to that pixel location in the left camera." ] }, { "cell_type": "markdown", "id": "7dbcd612-c31c-4a67-af02-e026746dc10b", "metadata": {}, "source": [ "#### The Algorithm\n", "You could imagine doing the unproject-reproject computation separately for every pixel, but that would be pretty expensive. However, we can make a key observation:\n", "\n", "For a fixed depth $d$, the \"unproject-reproject\" transformation is a homography!\n", "\n", "We could calculate it directly from the camera matrices (for a given depth) but we'll use a slightly different approach:\n", "1. Unproject the four corners of the left camera's image\n", "2. Reproject the corners onto the right camera's image\n", "3. Fit a homography $H_d$ from the four correspondences generated in (1-2)\n", "4. Warp the left image using $H_d$\n", "5. Compute NCC matching cost on the entire pair of images at once!\n", "   * Notice that this is now back to looking like a \"sliding window\" cross-correlation" ] }, { "cell_type": "code", "execution_count": null, "id": "8d33e039-d574-4766-b190-e7ca130e1a8d", "metadata": {}, "outputs": [], "source": [ "\n", "\n", "\n", "\n", "\n", "\n" ] }, { "cell_type": "markdown", "id": "92156250-8e01-4313-8fc7-a30e7a342e63", "metadata": {}, "source": [ "## (end)\n", "\n", "*Note:* We're skipping the following topics for lack of time, but I'm leaving the notes here in case you're interested." 
] }, { "cell_type": "markdown", "id": "627c5494-0b4e-44b1-9815-9c613c7b0ce1", "metadata": {}, "source": [ "#### Projective Geometry 1: Points and Lines\n", "\n", "Leading up to the topic of **epipolar** geometry, which describes the geometric relationships between two or more cameras, we're going to start with some fundamentals regarding points and lines in projective space ($\\mathbb{P}^2$).\n", "\n", "The notes have more detail on the following, but here's the plan:\n", "\n", "##### Homogeneous points\n", "\n", "Review, and recall their interpretation as vectors / rays in $\\mathbb{R}^3$.\n", "\n", "##### Homogeneous lines\n", "* Points (0D objects) in $\\mathbb{P}^2$ can be interpreted as rays (1D objects) through the origin in $\\mathbb{R}^3$.\n", "* Lines (1D objects) in $\\mathbb{P}^2$ can be interpreted as **planes** (2D objects) through the origin in $\\mathbb{R}^3$!\n", "\n", "After projection back onto the $w=1$ plane, a plane looks like a line!\n", "\n", "We represent these in homogeneous coordinates also using 3-vectors $\\ell = [a, b, c]$ that represent the **plane normal** in 3D. After projection, this corresponds to the 2D line equation $ax + by + c = 0$.\n", "\n", "##### HW #3-4\n", "\n", "3. Give the slope-intercept form of the line represented by homogeneous coordinates [1, 1, 0].\n", "\n", "4. Give the homogeneous coordinates of the line $y = -2x + 400$." 
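, "\n", "\n", "As a reference for converting between the two forms, here is a sketch that assumes $b \\neq 0$ (the line is not vertical); the symbols $m$ and $k$ for slope and intercept are notation introduced only for this note:\n", "\n", "$$\n", "ax + by + c = 0 \\iff y = -\\frac{a}{b}x - \\frac{c}{b}\n", "$$\n", "\n", "$$\n", "y = mx + k \\iff mx - y + k = 0 \\iff \\ell = [m, -1, k]\n", "$$\n", "\n", "Any nonzero scalar multiple of $[a, b, c]$ represents the same line, so the homogeneous representation is only unique up to scale."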
] }, { "cell_type": "code", "execution_count": null, "id": "9bb881ee-3c9e-4d2f-9dd2-18ee437f601f", "metadata": {}, "outputs": [], "source": [ "# HW #4: y = -2x + 400  <=>  2x + y - 400 = 0\n", "line_4 = [2, 1, -400]\n", "\n", "a, b, c = line_4\n", "# coordinate grid; with mgrid, x varies along axis 0 (the vertical axis in imshow)\n", "x, y = np.mgrid[:300, :300]\n", "\n", "#plt.imshow(a*x + b*y + c)\n", "# show which side of the line each grid point falls on\n", "plt.imshow(a*x + b*y + c > 0)\n", "plt.colorbar()\n" ] }, { "cell_type": "markdown", "id": "b3d5741a-a4cb-4b2b-b3d2-a70b6b3c0fc5", "metadata": {}, "source": [ "#### Points on Lines; Lines through Points\n", "\n", "The condition for whether a point $p$ lies on a line $\\ell$, or equivalently that a line $\\ell$ goes through a point $p$, is, elegantly:\n", "$$\n", "p \\cdot \\ell = 0\n", "$$\n", "\n", "##### HW #5: Show this algebraically\n", "\n", "5. Recall that a homogeneous point $p = [x, y, w]$ represents the 2D coordinates $(x/w, y/w)$, while a line $\\ell = [a, b, c]$ represents the 2D line $ax + by + c = 0$. Show *algebraically* that a line $\\ell = [a, b, c]$ goes through a point $p = [x, y, w]$ if and only if their dot product is zero." ] }, { "cell_type": "markdown", "id": "6822c5d9-d887-425e-b037-5e45ccd9be64", "metadata": {}, "source": [ "#### Point-Line Duality\n", "\n", "The **line** that passes through **two points (2D)** is the **plane** spanned by their two **vectors (3D)**.\n", "\n", "You can compute this plane's normal with the cross product!\n", "\n", "$$\\ell_{pq} = p \\times q$$\n", "\n", "See `geometry.py` for an implementation of the cross product.\n", "\n", "##### HW #6\n", "\n", "6. Use the cross product (not the tedious algebra approach!) to find the homogeneous representation of the line that goes through (70, 70) and (0, 40). Feel free to use software for this one; an implementation of `cross` is in `geometry.py` in the lecture repo." 
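, "\n", "\n", "To see why the cross product gives the right line, here is a short derivation. The component names $p = [p_x, p_y, p_w]$ and $q = [q_x, q_y, q_w]$ are notation introduced only for this sketch:\n", "\n", "$$\n", "\\ell_{pq} = p \\times q = [p_y q_w - p_w q_y, \\quad p_w q_x - p_x q_w, \\quad p_x q_y - p_y q_x]\n", "$$\n", "\n", "$$\n", "p \\cdot \\ell_{pq} = p_x(p_y q_w - p_w q_y) + p_y(p_w q_x - p_x q_w) + p_w(p_x q_y - p_y q_x) = 0\n", "$$\n", "\n", "The six terms cancel in pairs, and the same happens for $q \\cdot \\ell_{pq}$, so both points satisfy the point-on-line condition $p \\cdot \\ell = 0$."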
] }, { "cell_type": "code", "execution_count": null, "id": "95f9c075-abeb-4bde-b486-15513ecf1ba7", "metadata": {}, "outputs": [], "source": [ "# TODO line_6 = ...\n", "line_6 = geometry.cross([70, 70, 1], [0, 40, 1])\n", "\n", "a, b, c = line_6\n", "x, y = np.mgrid[:300, :300]\n", "#plt.imshow(a*x + b*y + c)\n", "plt.imshow(a*x + b*y + c > 0)\n", "plt.colorbar()\n", "line_6" ] }, { "cell_type": "markdown", "id": "df4b8c18-da8c-4393-a56e-a9c9fa492554", "metadata": {}, "source": [ "The **point** at the intersection of **two lines (2D)** is the **vector** that lies within both **planes (3D)**.\n", "\n", "In other words, it's a vector orthogonal to both plane normals.\n", "\n", "You can *also* compute this with the cross product!\n", "\n", "$$p = \\ell_1 \\times \\ell_2$$\n", "\n", "\n", "##### HW #7\n", "\n", "Find the intersection point of the two lines from #4 and #6." ] }, { "cell_type": "code", "execution_count": null, "id": "4b84e1b8-b185-4555-8855-dd71666e6274", "metadata": {}, "outputs": [], "source": [ "intersection = geometry.cross(line_4, line_6)\n", "intersection" ] }, { "cell_type": "code", "execution_count": null, "id": "ab5f1531-f213-4898-8b3f-0f7ebcda4a0c", "metadata": {}, "outputs": [], "source": [ "intersection = intersection / intersection[2]\n", "\n", "\n", "a1, b1, c1 = line_4\n", "a2, b2, c2 = line_6\n", "plt.imshow(np.abs((a1*x + b1*y + c1) * (a2*x + b2*y + c2)) < 1000)\n", "\n", "intersection" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.12" } }, "nbformat": 4, "nbformat_minor": 5 }