Lecture 9 - Image Alignment 2; Robust Model Fitting (RANSAC)¶
Announcements¶
- Project 1:
- GitHub Classroom is trying to run the tests for you when you push, but they haven't seen the uvlight yet. If the tests pass locally, you're in good shape! Sorry for any confusion.
- Code due tonight
- Artifacts due tomorrow night
- Project 2 is out! Due next Tuesday
- Be prepared for it to be a little bigger than project 1 (mainly just more pieces)
- Soliciting ideas and/or image submissions for our class sticker! I need to send something in by the end of this week.
- Today's tea: Four Seasons Si Ji Chun oolong
Goals¶
- Know how to find a least-squares best-fit transformation for:
- homography (with caveats)
- Understand the Random Sample Consensus (RANSAC) algorithm
# boilerplate setup
%load_ext autoreload
%autoreload 2
%matplotlib inline
import os
import sys
src_path = os.path.abspath("../src")
if src_path not in sys.path:
    sys.path.insert(0, src_path)
# Library imports
import numpy as np
import imageio.v3 as imageio
import matplotlib.pyplot as plt
import skimage as skim
import cv2
# codebase imports
import util
import filtering
import features
import geometry
Plan¶
- Homography fitting
- Outlier robustness: RANSAC
- Warping: forward and inverse
- Bilinear interpolation
Context: Panorama Stitching Overview¶
- Detect features - Harris corners
- Describe features - MOPS descriptor
- Match features - SSD + ratio test
- Estimate motion model from correspondences
- Translation
- Affine
- Projective
- Robustness to outliers - RANSAC
- Warp image(s) into common coordinate system and blend
- Inverse warping
- Blending
- 360 panoramas?
Recall our definition of the optimal transformation for a given set of correspondences is the one that minimizes the sum of squared residuals: $$ \min_T \sum_i||(T\mathbf{p}_i - \mathbf{p}_i')||^2 $$
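To make the objective concrete, here is a minimal sketch that evaluates the sum of squared residuals for one candidate $T$. The function name `sum_squared_residuals` and the toy data are made up for illustration, and the sketch assumes $T$ is a translation or affine matrix (last row `[0, 0, 1]`), so no homogeneous normalization is needed.

```python
import numpy as np

def sum_squared_residuals(T, p, p_prime):
    """Sum over i of ||T p_i - p_i'||^2.

    Assumes T is 3x3 with last row [0, 0, 1] (translation or affine).
    p and p_prime are (n, 2) arrays of corresponding points.
    """
    n = p.shape[0]
    p_h = np.hstack([p, np.ones((n, 1))])  # homogeneous coordinates, (n, 3)
    mapped = (T @ p_h.T).T[:, :2]          # apply T, drop the trailing 1
    residuals = mapped - p_prime           # per-point (x, y) residuals
    return np.sum(residuals ** 2)

# Hypothetical data: a pure translation by (10, 20)
T = np.array([[1.0, 0.0, 10.0],
              [0.0, 1.0, 20.0],
              [0.0, 0.0, 1.0]])
p = np.array([[0.0, 0.0], [5.0, 5.0], [2.0, 7.0]])
p_prime = p + np.array([10.0, 20.0])
p_prime[2] += [1.0, -2.0]  # perturb one match so its residual is nonzero

cost = sum_squared_residuals(T, p, p_prime)
```

Only the perturbed match contributes to the cost here; the other two correspondences are explained exactly by $T$.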
Homework Problem 1¶
Write down the $x$ and $y$ residuals for a pair of corresponding points $(x, y)$ in image 1 and $(x', y')$ in image 2 under a homography (projective) motion model. Assume the homography matrix is parameterized as $$ \begin{bmatrix} a & b & c\\ d & e & f\\ g & h & 1 \end{bmatrix} $$
Whiteboard:
- homography residuals
- roadblocks
- duct-tape fixes
Whiteboard: solving homogeneous least squares systems: $$ \min_\mathbf{x} ||A\mathbf{x}|| $$ subject to $$ ||\mathbf{x}|| = 1 $$
TL;DM (too long; didn't math):
- Decompose $A$ using the SVD: $$ U_{m \times m}, \Sigma_{m\times n}, V^T_{n \times n} = \mathrm{SVD}(A_{m \times n}) $$
- The optimal vector $\mathbf{x}^*$ is the row of $V^T$ (column of $V$) corresponding to the smallest singular value, i.e., the smallest diagonal element of $\Sigma$
- Linear algebra libraries usually return the singular values in descending order, so in practice $\mathbf{x}^*$ is the last row of $V^T$
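The recipe above can be sketched in a few lines of NumPy. The function name `solve_homogeneous_least_squares` and the example matrix are illustrative only; the sketch relies on `np.linalg.svd` returning singular values in descending order.

```python
import numpy as np

def solve_homogeneous_least_squares(A):
    """Return the unit vector x that minimizes ||A x||."""
    # np.linalg.svd returns U, the singular values (descending), and V^T.
    # The last row of V^T pairs with the smallest singular value.
    U, S, Vt = np.linalg.svd(A)
    return Vt[-1]

# Toy system: each row is (x, y, 1) for a point on the line x + 2y - 3 = 0,
# so the exact solution is proportional to (1, 2, -3).
A = np.array([[1.0, 1.0, 1.0],
              [3.0, 0.0, 1.0],
              [-1.0, 2.0, 1.0],
              [5.0, -1.0, 1.0]])
x_star = solve_homogeneous_least_squares(A)
```

The result is only defined up to sign, since $\mathbf{x}$ and $-\mathbf{x}$ have the same norm and give the same $||A\mathbf{x}||$.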
Next up: Robustness to outliers¶

RANSAC: RAndom SAmple Consensus¶
Finding a transformation is a model fitting problem. A simple model fitting problem that we'll use as an analogy is line fitting. (In fact, this is what linear least squares is doing for us; it's just fitting higher-dimensional lines.)
Problem statement, for now: Given a set of points with some outliers, find the line that fits the non-outliers best.
Key Idea:
“All good matches are alike; every bad match is bad in its own way.”
-Tolstoy, as misquoted by Alyosha Efros
Observation: If I have a candidate model, I can tell how good it is by measuring how many points "agree" on that model.
Algorithm, take 1:
for every possible line:
    count how many points are inliers to that line
return the line with the most inliers
Runtime: O($\infty$)
Algorithm, take 2:
for every line that goes through two of the given points:
    count how many points are inliers to that line
return the line with the most inliers
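Runtime: O($n^3$), since there are O($n^2$) pairs of points and each candidate line is checked against all $n$ points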
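Take 2 is small enough to write out directly. This is a rough sketch, not code from the lecture codebase: the names `count_inliers` and `best_line_exhaustive`, the distance threshold of 0.1, and the toy points are all assumptions made for illustration.

```python
import itertools
import numpy as np

def count_inliers(points, p, q, threshold):
    """Count points within `threshold` of the infinite line through p and q."""
    d = q - p
    normal = np.array([-d[1], d[0]]) / np.linalg.norm(d)  # unit normal
    distances = np.abs((points - p) @ normal)             # point-to-line distance
    return int(np.sum(distances < threshold))

def best_line_exhaustive(points, threshold=0.1):
    """Try the line through every pair of points; keep the one with the most inliers."""
    best_pair, best_count = None, -1
    for i, j in itertools.combinations(range(len(points)), 2):
        count = count_inliers(points, points[i], points[j], threshold)
        if count > best_count:
            best_pair, best_count = (i, j), count
    return best_pair, best_count

# Five points on y = x, plus two outliers
points = np.array([[0, 0], [1, 1], [2, 2], [3, 3], [4, 4], [0, 3], [4, 1]], dtype=float)
pair, count = best_line_exhaustive(points)
```

The double loop hidden inside `itertools.combinations` is what makes this impractical once $n$ gets large.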
Algorithm, take 3: RANSAC - see whiteboard notes
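For reference alongside the whiteboard notes, here is a minimal sketch of RANSAC for line fitting. It follows the usual sample / score / keep-the-best structure; the function name `ransac_line`, the iteration count, the threshold, and the final refit to all inliers are assumptions made here, not necessarily what was written on the whiteboard.

```python
import numpy as np

def ransac_line(points, num_iters=100, threshold=0.1, seed=0):
    """Fit a line to 2D points with RANSAC.

    Returns (inlier_mask, (centroid, direction)) for the best line found.
    """
    rng = np.random.default_rng(seed)
    n = len(points)
    best_mask = np.zeros(n, dtype=bool)
    for _ in range(num_iters):
        # 1. Sample a minimal set: two distinct points determine a line
        i, j = rng.choice(n, size=2, replace=False)
        d = points[j] - points[i]
        length = np.linalg.norm(d)
        if length == 0:
            continue  # degenerate sample (coincident points)
        normal = np.array([-d[1], d[0]]) / length
        # 2. Count the points that agree with this candidate line
        mask = np.abs((points - points[i]) @ normal) < threshold
        # 3. Keep the candidate with the largest consensus set
        if mask.sum() > best_mask.sum():
            best_mask = mask
    # 4. Refit to all inliers of the best candidate (total least squares via SVD)
    inliers = points[best_mask]
    centroid = inliers.mean(axis=0)
    _, _, Vt = np.linalg.svd(inliers - centroid)
    return best_mask, (centroid, Vt[0])

# Toy data: 20 points on y = 2x + 1, plus 5 outliers
xs = np.linspace(0, 5, 20)
on_line = np.stack([xs, 2 * xs + 1], axis=1)
outliers = np.array([[0, 5], [1, 8], [2, 0], [4, 2], [5, 4]], dtype=float)
points = np.vstack([on_line, outliers])

mask, (centroid, direction) = ransac_line(points)
```

Unlike take 2, the cost here is controlled by `num_iters` rather than by the number of point pairs.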
Homework Problems 2-4¶
- In the inner loop of RANSAC, how many points are used to fit a candidate model if you are fitting a line to a set of 2D points?
- In the inner loop of RANSAC, how many pairs of corresponding points are used to fit a candidate model if you are fitting a translation to a set of correspondences?
- In the inner loop of RANSAC, how many pairs of corresponding points are used to fit a candidate model if you are fitting a homography to a set of correspondences?
Homework Problem 5 (preview)¶
In this problem, we'll analyze the RANSAC algorithm to help us decide how many iterations ($K$) to run. Suppose we are fitting some model that requires a minimal set of $s$ points to fully determine (e.g., $s=4$ matches for a homography, $s=2$ points for a line). We also know (or have assumed) that the data has an inlier ratio of $r = \frac{\text{\# inliers}}{\text{\# data points}}$; in other words, a randomly sampled point from the dataset has a probability of $r$ of being an inlier.
(the following is included here just in case there's time)
Warping: Forward and Inverse¶
(See whiteboard notes.)
Forward warping:
for x, y in src:
    x', y' = T(x, y)
    dst[x', y'] = src[x, y]
Inverse warping:
Tinv = inv(T)
for (x', y') in dst:
    x, y = Tinv(x', y')
    dst[x', y'] = src[x, y]
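To see one practical difference between the two loops, here is a small sketch of the forward warping pseudocode only. The function name `forward_warp`, the rounding to the nearest destination pixel, and the `filled` mask are choices made for this illustration. With a 2x scale, most destination pixels never receive a value.

```python
import numpy as np

def forward_warp(src, T):
    """Forward-warp a grayscale image: push each source pixel to round(T(x, y)).

    Returns the warped image and a boolean mask of destination pixels that
    received a value.
    """
    H, W = src.shape
    dst = np.zeros_like(src)
    filled = np.zeros((H, W), dtype=bool)
    for y in range(H):
        for x in range(W):
            xp, yp, w = T @ np.array([x, y, 1.0])
            xi, yi = int(round(xp / w)), int(round(yp / w))
            if 0 <= xi < W and 0 <= yi < H:
                dst[yi, xi] = src[y, x]  # NumPy indexing is (row, col) = (y, x)
                filled[yi, xi] = True
    return dst, filled

src = np.ones((8, 8))
T = np.diag([2.0, 2.0, 1.0])  # scale by 2 about the origin
dst, filled = forward_warp(src, T)
num_holes = int((~filled).sum())  # destination pixels that nothing landed on
```

Inverse warping avoids these holes because it loops over destination pixels, so every output pixel gets assigned.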
Homework Problem 6¶
- Complete the following function with Python-esque pseudocode (or working code in the lecture codebase), which performs inverse warping with nearest-neighbor sampling in the source image.
def warp(img, tx, dsize=None):
    """ Warp img using tx, a matrix representing a geometric transformation.
    Pre: tx is 3x3 (or some upper-left slice of a 3x3 matrix). img is grayscale.
    Returns: an output image of shape dsize with the warped img"""
    H, W = img.shape[:2]
    # turn a 2x2 or 2x3 tx into a full 3x3 matrix
    txH, txW = tx.shape
    M = np.eye(3)
    M[:txH,:txW] = tx
    # inverse warping maps destination pixels back into the source
    Minv = np.linalg.inv(M)
    if dsize is None:
        DH, DW = (H, W)
    else:
        # dsize is given as (width, height), so reverse it to get (rows, cols)
        DH, DW = dsize[::-1]
    out = np.zeros((DH, DW))
    # your code here
    return out
y1 = imageio.imread("../data/yos1.jpg").astype(np.float32) / 255
y1 = skim.color.rgb2gray(y1)
h, w = y1.shape
tx = np.eye(3)
tx[:2,2] = [10, 20]
tx[0,1] = 0.1
util.imshow_gray(geometry.warp(y1, tx))