Winter 2026

Look at the image from very close and then very far. What do you see?
Assigned: Tuesday, January 6th, 2026
Code Due: Tuesday, January 13th, 2026 at 10pm (via Github)
Artifact Due: Wednesday, January 14th, 2026 at 10pm (via Github)
This assignment will be done individually.
The objective of this project is to efficiently implement image filtering (i.e., convolution) and demonstrate its use in a couple of visually interesting image processing applications. Completion of this project satisfies part of the first course-level learning objective:
Demonstrate a thorough understanding of photometric […] image transformations, including convolution […]
Along the way, this project will help you gain the following skills and knowledge, which will be useful throughout the remainder of the course:
Ability to work with image data represented as multidimensional arrays in Python
Ability to write efficient vectorized code for image processing operations
Understanding of spatial frequency and how it can be manipulated using filters.
This assignment explores two applications of image filtering using convolution.
The first application is to create hybrid images like the one shown above using a simplified version of this SIGGRAPH paper by Oliva, Torralba, and Schyns from 2006. The basic idea is that high frequency tends to dominate perception when it is available, but, at a distance, only the low-frequency (smooth) part of the signal can be seen. By blending the high-frequency portion of one image with the low-frequency portion of another, you get a hybrid image that leads to different interpretations at different distances. You will implement filtering routines in numpy, then use them to create your own hybrid images.
The second application is detail enhancement using a Laplacian pyramid. Making use of the same filtering routines from the first part, you’ll write code to build a Laplacian pyramid for an image, then reconstruct the image with each layer scaled up or down, allowing you to accentuate or attenuate different frequency content in the image.
You will find the Project 1 - Github Classroom invitation link on the
Resources tab of the course hub. Click this link to accept the
invitation and create your personal repository for this project. Your
repository already contains skeleton code, including a user interface
for creating hybrid images (hybrid_gui.py) and a UI for
Laplacian detail enhancement (laplacian_gui.py). You will
complete several functions in filtering.py that implement
the functionality used by the UI programs. The next section walks you
through each function. Please keep track of the approximate number of
hours you spend on this assignment, as you will be asked to report this
in hours.txt when you submit.
Start by git cloning your project repository onto your local machine.
The project’s software dependencies are managed using
uv, a modern Python package management system. You can find more info, including installation instructions, here: Installing uv. I’m on a Mac, so I used the brew method, but other methods should also work fine.
Most dependencies are pure Python and can be managed within uv. On some systems, you may need to separately install Tk support for Python (this was true in my case). To find out, you can run:
python3 -c "import tkinter"
If this runs without error, you should be set. Otherwise, you may need to install a package with your system’s package manager; on a Mac with homebrew, this looks like:
brew install python-tk
You should now be able to run the following:
uv run hybrid_gui.py
You should see uv doing some environment setup, then a graphical user interface window will pop up. If you see a window with some buttons, chances are that the dependencies and setup are all working.
For just this assignment, you are forbidden from using any built-in
functions from Numpy, Scipy, OpenCV, or other libraries that pertain to
filtering and resizing. This
limitation will be lifted in future assignments, but for now, you should
use for loops, vectorized numpy operations, and slicing/indexing. Basic math operations like np.sum are
fine. If you’re not sure whether something is permitted, just ask.
Your first step is to implement the basic filtering routines to perform 2D discrete cross-correlation and convolution. You will implement the following five functions:
cross_correlation_2d
This will be the workhorse for much of the remaining functionality. Take a look back at the lecture notes if you need a reminder of the definition of cross-correlation. In this implementation, we’re using the “same” output size and zero padding to fill in values outside the input image.
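For reference, with a \((2k+1) \times (2k+1)\) kernel \(H\) (indexed so that \(H(0,0)\) is its center entry) and an image \(F\) treated as zero outside its bounds, the “same”-size cross-correlation output is

\[ G(i, j) = \sum_{u=-k}^{k} \sum_{v=-k}^{k} H(u, v)\, F(i + u,\, j + v). \]

(Your lecture notes may use slightly different notation, but the idea is the same.)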
The cross_correlation_2d function is computationally
intensive: filtering an image of size M x N with a kernel of size K x K
is an \(O(MNK^2)\) operation. For
arbitrary kernels, this is unavoidable without using Fourier domain
tricks that we haven’t covered. However, numpy’s array processing
routines are highly optimized and allow for huge speedups of array
operations relative to Python for loops, which must be
executed line by line by the Python interpreter.
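To make that cost concrete: filtering a 512 x 512 image with a 15 x 15 kernel takes 512 · 512 · 225 ≈ 59 million multiply-adds per channel, which is why a pure-Python quadruple loop feels painfully slow.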
As usual, you should focus on getting a correct solution first. I strongly encourage you to write a slow version with as many nested for loops as you need. Then, see if you can eliminate some of the nested loops by batching computations with numpy array operations. Because the rest of the assignment depends heavily on this function, it’s worth some effort to optimize it. One way to go about this is to look in the code for computations that could be batched together as array operations. Another would be to play around with the equation for calculating cross correlation and try rearranging terms to minimize repetition.
A full-credit solution will use only two for loops, which iterate over the values in the kernel (not the image), for a total of only 9 Python for-loop iterations given a 3x3 kernel. That said, most of the efficiency points are awarded for an asymptotically efficient approach (see the rubric for details on the efficiency points). Try not to sacrifice readability: make sure your approach is well-commented if you’re making any nontrivial optimizations.
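To make the kernel-loop idea concrete, here is a minimal sketch of one possible approach, assuming a single-channel float image and an odd-sized kernel. The names are for illustration only; your actual implementation should follow the signatures in filtering.py and handle whatever inputs the skeleton expects.

```python
import numpy as np

def cross_correlation_2d_sketch(img, kernel):
    """Illustrative sketch: 'same' output size, zero padding, grayscale input."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    # Zero-pad the image so every output pixel has a full neighborhood.
    padded = np.zeros((img.shape[0] + 2 * ph, img.shape[1] + 2 * pw), dtype=np.float64)
    padded[ph:ph + img.shape[0], pw:pw + img.shape[1]] = img
    out = np.zeros_like(img, dtype=np.float64)
    # Loop over the kernel entries (K*K iterations), not over the image pixels.
    for u in range(kh):
        for v in range(kw):
            # Each kernel entry scales a shifted window of the padded image.
            out += kernel[u, v] * padded[u:u + img.shape[0], v:v + img.shape[1]]
    return out
```

The two Python loops run only K² times; each iteration does an elementwise multiply-add over the whole image in optimized numpy code, which is where the speedup comes from.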
convolve_2d
You should make use of your cross-correlation function to implement this in just a few lines.
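Since convolution is just cross-correlation with the kernel flipped in both dimensions, one possible shape (a sketch, assuming your cross_correlation_2d is already working) is:

```python
def convolve_2d(img, kernel):
    # Flipping the kernel vertically and horizontally turns
    # cross-correlation into convolution.
    return cross_correlation_2d(img, kernel[::-1, ::-1])
```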
gaussian_blur_kernel_2d
This function generates a Gaussian blur filter of a given size. The coordinate system of a filter places (0,0) at the middle of its center pixel, and pixel centers are assumed to be spaced 1 unit apart. Evaluate the Gaussian function (given in the lecture slides) with the given \(\sigma\) for the position of each pixel in a filter of the given dimensions.
We’d like our filter values to sum to one; meanwhile, one property of a Gaussian function is that its integral over the entire domain is one. This means if our filter has finite size, its values won’t (quite) sum to 1. Because we want to preserve overall image brightness, you should re-normalize the values in your Gaussian kernel so that they do sum to exactly 1.
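For reference, the standard 2D Gaussian evaluated at a pixel center \((x, y)\) (with the origin at the center of the kernel; your slides may write it slightly differently) is

\[ G_\sigma(x, y) = \frac{1}{2\pi\sigma^2} \exp\!\left(-\frac{x^2 + y^2}{2\sigma^2}\right), \]

and the normalization step is just dividing every entry by the kernel’s sum so that the entries add to exactly 1. Note that normalization cancels any constant factor, so the \(\tfrac{1}{2\pi\sigma^2}\) term ends up not affecting the final kernel values.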
low_pass
Recall that a low-pass filter leaves lower frequencies alone while attenuating high frequencies. This is exactly what blurring does, so using the functions you’ve already implemented makes this one pretty short.
high_pass
A high-pass filter does the opposite of a low-pass filter: it preserves high frequencies while eliminating low frequencies. We could achieve this with a single filter, but it’s easier and barely less efficient to simply subtract the low-pass image from the original to get the high-pass result.
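As a sketch of how short these can be (the parameter order of gaussian_blur_kernel_2d here is a guess; follow the signatures in filtering.py):

```python
def low_pass(img, sigma, size):
    # Blurring keeps low frequencies and attenuates high ones.
    return convolve_2d(img, gaussian_blur_kernel_2d(sigma, size, size))

def high_pass(img, sigma, size):
    # Whatever the blur removed is, by definition, the high-frequency content.
    return img - low_pass(img, sigma, size)
```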
test.py provides you with a non-exhaustive set of unit
tests that you may find helpful for debugging. Run it with
uv run test.py.
Hybrid images like the one at the top of this page are made by combining two different images, one low-pass filtered and one high-pass filtered. For example, the image at the top of the page is generated from these two images:

Adding the two together gives you the image at the top of this page. One easy way to visualize the effect is to view progressively downsampled versions of the image (i.e., a Gaussian pyramid):

You can also use your browser’s zoom functionality to resize images when viewing them in a web browser.
The function create_hybrid_images has been implemented
for you. Its parameters permit the caller to choose whether each image
is high- or low-pass filtered, as well as the sigma and kernel size for
each filter.
The file hybrid_gui.py is a program that lets you see the results of creating hybrid images and play with parameters interactively. This is where it becomes really nice to have your cross-correlation running blazingly fast.
The GUI allows you to load two images, align them, then combine
different frequency content from each image into a single hybrid image.
Start by clicking “Load First Image”, loading
resources/dog.jpg, then “Load Second Image”, and load
resources/cat.jpg. Now, you’ll click three points on each
image to help the program align them. Click the dog’s left eye, right
eye, and tongue (in that order). Then click the cat’s left eye, right
eye, and nose. Now click “View Hybrid”, where you’ll see the combined
image in the middle. You can now play around with the filter size and
blur width (sigma) of each filter.
You can also save and load correspondences and filter configurations. A preset that gets you something somewhat close to the image at the top of this page can be loaded with the following command:
uv run hybrid_gui.py -t resources/sample-correspondence.json -c resources/sample-config.json
Note: The result from this preset will not match the result on this webpage (which comes directly from the SIGGRAPH paper) because our implementation is a simplified version of theirs.
In class we talked about how Laplacian pyramids can be used to separate out different slices of frequency content in an image. One straightforward application of these is to apply weights that independently boost or attenuate specific levels of the pyramid when reconstructing. For example, I took this image of an undisclosed beach in the vicinity of Bellingham:

I built a 7-level Laplacian pyramid, then chose weights that were less than one for the two lowest-frequency levels, greater than one for two middle-frequency levels, and less than one for the highest three. Reconstructing with the levels weighted this way results in an image with the low and high frequencies muted, and the mid-frequency contrasts enhanced:

Your task in this section is to implement the two functions
construct_laplacian and reconstruct_laplacian.
For an overview of how this is done, take a look back at the lecture
notes. The sections below specify the differences from the basic version
presented in lecture.
To keep things simple, these functions assume that the dimensions of your image are divisible by 2 enough times to do simple 2x down- and up-sampling for each level of the pyramid without any integer-division roundoff error in the dimensions.
construct_laplacian
In lecture, we computed the high-pass image as L_i = f - blur(f). The problem with this approach is that when we go to reconstruct, the upsampled image that will get added to the current high-pass image doesn’t exactly equal blur(f), and so L_i + upsample(rec) won’t exactly equal the original f (here, rec refers to the thus-far reconstructed image from the next smaller level of the pyramid). We can solve this by tweaking the algorithm a little to compute the high-pass image based on exactly what we’ll have when reconstructing, namely, upsample(rec). Instead of L_i = f - blur(f), we do the down- and up-sampling up front so we can save the precise difference: L_i = f - upsample(subsample(blur(f))).
You will probably find it helpful to make some helper methods here
for down- and up-sampling. For best results, I recommend using the blur
filter proposed in the original
paper, which is a separable filter built from the following
approximation of a 1D Gaussian:
[0.0625, 0.25, 0.375, 0.25, 0.0625]. Whatever filter you
use, you will need to use the same one for downsampling and upsampling
(reconstruction) in order to achieve an accurate reconstruction.
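Putting the pieces together, a minimal sketch of the construction loop might look like the following. Here blur, subsample, and upsample are hypothetical helper names (not provided in the skeleton), and the real signature of construct_laplacian may differ from this:

```python
def construct_laplacian(img, levels):
    pyramid = []
    f = img
    for _ in range(levels - 1):
        smaller = subsample(blur(f))   # blur, then keep every other row/column
        rec = upsample(smaller)        # back up to f's size, using the same filter
        pyramid.append(f - rec)        # save exactly what reconstruction will need
        f = smaller
    pyramid.append(f)                  # the final level is the remaining low-pass image
    return pyramid
```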
reconstruct_laplacian
The reconstruction procedure follows the one presented in lecture with one modification: each level of the pyramid can be multiplied by a scalar weight before being added back into the result image. This allows for independent manipulation of the different frequency slices. The function takes a weights parameter that can be None, in which case the weights are assumed to be all 1, or a list containing one weight per level of the pyramid.
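A minimal sketch of the weighted reconstruction, using the same hypothetical upsample helper as above (again, an outline rather than the required implementation):

```python
def reconstruct_laplacian(pyramid, weights=None):
    if weights is None:
        weights = [1.0] * len(pyramid)
    # Start from the smallest, lowest-frequency level...
    rec = weights[-1] * pyramid[-1]
    # ...then work back up, scaling each band before adding it in.
    for L, w in zip(reversed(pyramid[:-1]), reversed(weights[:-1])):
        rec = w * L + upsample(rec)
    return rec
```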
When both methods are implemented, you should be able to call
reconstruct on the output of construct with
weights=None and get a visually identical image back (there
may be imperceptible differences due to compression and
quantization).
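As a quick sanity check (assuming float images and the argument names used in the sketches above, which may differ from the skeleton’s):

```python
pyr = construct_laplacian(img, levels=5)
out = reconstruct_laplacian(pyr, weights=None)
# With weights=None the reconstruction should match the input up to
# tiny floating-point differences.
assert abs(out - img).max() < 1e-6
```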
A barebones GUI program is provided in laplacian_gui.py
that allows you to interactively edit images using your Laplacian
pyramid function. You can run the GUI with no arguments and load an
image using the button. You can also specify an input image to load
immediately with the --image (-i) flag, and a
number of pyramid levels to compute (the default is 5) with the
--levels (-l) flag. For example, running:
uv run laplacian_gui.py -i resources/beach.jpg -l 7
loads the beach image and allows you to edit it using a 7-level pyramid. As with the hybrid GUI, you can save out the edited image with the Save Image button.
Now that you’ve made some nifty image editing tools, use them to make some cool images. Find your own source images and create your own hybrid image. I suggest reading Section 2.2 of the paper and taking a look at their hybrid images gallery for guidance and inspiration on what kinds of image pairs make good hybrids.
Also pick a photo and use the Laplacian pyramid editor to edit different frequency bands to create an interesting result.
I will collect the artifacts into a showcase webpage and the class will get the opportunity to vote on their favorites. The top one or two artifacts will receive a nominal amount of extra credit.
Create hours.txt in your repository. On the first line of this file, include a single integer estimate of the number of hours you spent working on this assignment. Below that, you may optionally include a reflection on how it went, anything you found particularly confusing or helpful, and/or suggestions for improving the assignment.
Push your code (including hours.txt) to your P1 repository on Github by the code deadline.
Add hybrid.jpg, your hybrid image artifact, and laplacian.jpg, your Laplacian edited artifact, to the root of your project repo, and push by the artifact deadline. Note that the artifact deadline is one day later than the code deadline.

Points are awarded for correctness and efficiency, and deducted for issues with clarity or submission mechanics.
| Correctness (30 points) | |
|---|---|
| Filtering (20 points) | Correctness as determined by automated tests. |
| Laplacian construction (4 points) | Construction produces a correct Laplacian pyramid. |
| Laplacian reconstruction (4 points) | Unweighted reconstructed image is visually identical to the original. |
| Laplacian editing (2 points) | Reconstruction correctly applies weights to individual levels. |

| Efficiency (16 points) | |
|---|---|
| 10 points | Filtering routines are asymptotically efficient. |
| 3 points | cross_correlation_2d uses vectorization to avoid quadruply-nested for loops. |
| 3 points | cross_correlation_2d uses no more than 2 nested Python loops, which traverse over the kernel, not the image. |

| Artifacts (2 points each) | |
|---|---|
| hybrid.jpg and laplacian.jpg | Artifacts are submitted via git. |
Clarity
Deductions for poor coding style may be made. Please see the syllabus for general coding guidelines. Up to two points may be deducted for each of the following:
Part 1 of this assignment is based on versions developed and refined by Noah Snavely, Kavita Bala, James Hays, Derek Hoiem, and numerous underappreciated TAs.