Jigsaw Puzzle Solving

Porter Westling, Derek Campbell, and Teddy Wang


Introduction

         The goal of this project was to create a program to solve simple jigsaw puzzles like the one pictured above. We accomplished this goal through the following steps:


Smoothing and Thresholding

         Before running the edge chain finder and the image stitcher, we preprocess the image with smoothing and thresholding, which improves the results of the later stages. For smoothing, we apply a Gaussian filter. We usually use a sigma of 3 or 4, but sometimes need higher values when the edges of the puzzle pieces are not very smooth. Smoothing reduces the irregularities along the edges of the pieces in the original image; with those irregularities present, it is difficult to segment the pieces from the background and to find concave and convex corners correctly. For thresholding, every pixel above a certain threshold value is changed to black. We used a threshold of around 185 in most cases, and a higher value when the pieces themselves contain many white pixels. After thresholding, it is much easier to locate the puzzle pieces and run the segmentation process.
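
         The two preprocessing steps can be sketched as follows. This is a minimal illustration in Python with NumPy, not the project's actual code. The function names, the 3-sigma kernel truncation, the border handling, and the choice to turn every remaining pixel white are all assumptions made for the example.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1-D Gaussian kernel truncated at about 3 sigma, normalized to sum to 1."""
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def smooth(gray, sigma=3.0):
    """Separable Gaussian blur; borders are handled by replicating edge pixels."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    padded = np.pad(gray.astype(float), r, mode="edge")
    # Filter along each row, then along each column.
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)

def threshold(smoothed, level=185):
    """Pixels brighter than `level` become black (0).

    Assumption: all other pixels become white (255); the text only
    specifies what happens to pixels above the threshold.
    """
    return np.where(smoothed > level, 0, 255).astype(np.uint8)
```

         A call such as `threshold(smooth(gray, sigma=3.0), level=185)` applies both steps with the typical values quoted above.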


Edge Detection and Matching

         We used a recursive edge "chaining" method to find the edges of each puzzle piece and to record each piece's shape. The chaining method searches the thresholded image along scan lines until it encounters a piece pixel that has a non-piece pixel in its four-neighborhood. It then recursively finds piece-pixel neighbors of the last found pixel that themselves have non-piece pixels in their four-neighborhoods, testing candidates in counter-clockwise order so that all edge chains are recorded in the same direction. As each pixel is added to an edge chain, it is marked as belonging to chain 1, 2, 3, etc., both so that the chain is not found again on later passes and for use by our segmentation program. These markings are stored as gray values in a modified threshold image.
         Once we have created these edge chains, we normalize them by distance, both to reduce the size of later calculations and to remove errors caused by aliased lines being "longer" (in pixel count) than non-aliased lines of the same Euclidean length. We normalize by including the first entry of the chain and then repeatedly including the first pixel along the chain that lies more than a certain Euclidean distance from the last included pixel. This cuts the chains to as little as 1/10 of their original size and improves matching, since the process also smooths the chain values somewhat. Using these normalized chains, we then compute the "angle" at each point on each edge chain, defined as the exterior angle of the puzzle piece at that point. To compute it, we calculate the two vectors from the target edge pixel to each of its neighbors and find the angle between these vectors. These angles are then normalized by subtracting 180 degrees, so that an angle of "0" represents a straight line.
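
         The distance normalization and the angle computation can be sketched as below. This is an illustrative Python version under stated assumptions, not the original implementation: chains are lists of (x, y) tuples, the chain is treated as closed, and the angle is measured counter-clockwise from the vector toward the previous point to the vector toward the next point. Which sign corresponds to a tab and which to a blank depends on the traversal direction and on whether the y axis points up or down, so the sign convention here is an assumption.

```python
import math

def normalize_chain(chain, min_dist):
    """Keep the first point, then each first point farther than
    `min_dist` (Euclidean) from the most recently kept point."""
    kept = [chain[0]]
    for x, y in chain[1:]:
        lx, ly = kept[-1]
        if math.hypot(x - lx, y - ly) > min_dist:
            kept.append((x, y))
    return kept

def exterior_angles(chain):
    """Angle at each point of a closed chain, minus 180 degrees.

    0 means the chain runs straight through the point; the result lies
    in [-180, 180). The sign convention is an assumption of this sketch.
    """
    n = len(chain)
    angles = []
    for i, (px, py) in enumerate(chain):
        qx, qy = chain[(i - 1) % n]   # previous neighbor on the chain
        rx, ry = chain[(i + 1) % n]   # next neighbor on the chain
        ux, uy = qx - px, qy - py     # vector to previous neighbor
        vx, vy = rx - px, ry - py     # vector to next neighbor
        cross = ux * vy - uy * vx
        dot = ux * vx + uy * vy
        full = math.degrees(math.atan2(cross, dot)) % 360.0
        angles.append(full - 180.0)
    return angles
```

         Because each value depends only on a point and its two neighbors, the resulting list does not change when the piece is rotated or translated in the image.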
         To find matches, we compare the edge chains at various offsets, in essence rotating the pieces against each other. For each comparison, we begin at the starting point on each chain (determined by the offset currently being used) and "integrate" the curve back from the stored angles by adding up the angles as we travel down the chain. This lets us store the curves in a rotation-invariant form (the exterior angle at each point) while still making sure that the matching process accounts for the actual shape of the curve, and not just the positions of sharp changes. At each point, the error of the match is the absolute value of the sum of the base piece angle and the matching piece angle (a perfect match sums to zero), and we sum these errors as we go. For each piece, we track the best match found so far based on these error values and ultimately keep one match per piece. By keeping one match for each piece, and ensuring that no two matches connect the same two pieces, we ensure that our matches span the puzzle and allow for a full solution. We then store the best computed matches as columns of matching points in a text file, similar to the way in which SIFT matches were stored for our image warping programs.
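
         One way to read the error computation is sketched below in Python. It is only an illustration under assumptions the text does not spell out: the comparison covers a fixed window of `length` points, the matching piece's chain is walked in the opposite direction (since both chains are recorded with the same rotation, a shared border would be traversed in opposite senses), and the function and parameter names are hypothetical.

```python
def match_error(base_angles, other_angles, base_start, other_start, length):
    """Sum over the window of |integrated base angle + integrated other angle|."""
    n, m = len(base_angles), len(other_angles)
    base_sum = 0.0
    other_sum = 0.0
    total = 0.0
    for k in range(length):
        # "Integrate" each curve by accumulating its angles along the chain.
        base_sum += base_angles[(base_start + k) % n]
        # Assumption: the mating chain is walked backwards.
        other_sum += other_angles[(other_start - k) % m]
        total += abs(base_sum + other_sum)
    return total

def best_offset(base_angles, other_angles, length):
    """Try every pair of starting offsets; return (error, base_start, other_start)."""
    best = None
    for i in range(len(base_angles)):
        for j in range(len(other_angles)):
            err = match_error(base_angles, other_angles, i, j, length)
            if best is None or err < best[0]:
                best = (err, i, j)
    return best
```

         Comparing running sums rather than individual angles means that an early mismatch keeps contributing error further down the chain, which is what ties the score to the overall shape of the curve.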


Image Separation and Stitching

         Before the pieces of the puzzle are stitched together, each is placed in its own image. The individual images have the same dimensions as the input image so that the coordinates of the matched points remain meaningful. Our function for separating the pieces takes the color values from the original input image, but uses the modified threshold image to choose which pixels to place in each separate image. In the modified threshold image, the perimeter of each piece is given a unique gray value, while the interior of the piece is black and the background is white. The separation is performed with a for loop over the number of puzzle pieces, as determined by the edge-finding program that creates the modified threshold image. Each time through the loop, our program runs through the pixels of the thresholded image until it finds a gray value whose intensity corresponds to the current value of the loop counter. This represents the edge of the piece to be separated out. There are many cases to consider, but in general, our implementation starts sampling pixels from a scanline when it encounters an edge, and stops when it encounters an edge again (i.e. the other side of the piece). Occasionally there will be stray white pixels in the interior of our pieces; this does not affect the separation in most cases. Our separator depends heavily on the edge-finding code producing a clean result; if the modified threshold image does not set the intensity of every piece perimeter pixel to the appropriate gray value, the separator will produce errors ranging from stray horizontal lines of color to completely inaccurate slicings of the original image.
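
         A heavily simplified sketch of the scanline idea follows, in Python with NumPy. It is not the separator described above: on each row it simply copies everything between the leftmost and rightmost perimeter pixel of the requested piece, which ignores the "many cases" mentioned in the text (for example, rows that cross a concavity). The function name, the array layout, and the white fill value are assumptions.

```python
import numpy as np

def separate_piece(color, labels, k, fill=255):
    """Copy piece `k` out of `color` into a new image of the same size.

    `labels` is the modified threshold image, in which the perimeter
    pixels of piece k carry the gray value k. Everything outside the
    copied spans is set to `fill` (an assumption of this sketch).
    """
    out = np.full_like(color, fill)
    for y in range(labels.shape[0]):
        cols = np.flatnonzero(labels[y] == k)   # perimeter pixels on this row
        if cols.size:
            x0, x1 = cols[0], cols[-1]
            out[y, x0:x1 + 1] = color[y, x0:x1 + 1]
    return out
```

         Even in this reduced form, a gap in the perimeter labeling changes which span is copied on a row, which illustrates why the separator is so sensitive to the quality of the edge-finding output.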

         To stitch the separated images together, we use a modification of the stitching code from the panorama stitching project. Our edge matcher produces a list of points to be matched, comparable to the output of David Lowe's SIFT algorithm. This list is a four-column text file in which each row holds the two points to be matched. The list is passed to NumPy's linear least-squares solver, which computes the affine transform associated with the matched points. We chose an affine transform over a pure rotation-and-translation matrix both for ease of implementation and because the affine transforms seemed to fit pieces together quite snugly even with minor errors in our matched points. Once the affine transform has been computed for every match found by our matching program, the images containing the individual pieces are stitched together. One piece is chosen as the 'base' piece; this piece is not transformed during the process. Any piece that has a direct transform onto the base piece is warped into the image. Pieces that do not have a direct transform (i.e. pieces that do not touch the base piece in the solved puzzle) are transformed by a product of transforms; for example, if piece 1 is the base, piece 2 matches piece 1, and piece 3 matches piece 2 but not piece 1, the transform for 3 into 1 is the product of the transforms from 3 into 2 and from 2 into 1. For every match, transforms are computed in both directions; this ensures that a path to the base piece can be found for any piece in the puzzle.
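
         The least-squares fit and the chaining of transforms can be sketched as below, using `numpy.linalg.lstsq`. This is an illustration rather than the project's stitching code: the helper names are hypothetical, transforms are represented as 3x3 homogeneous matrices, and the column order of the match file (x and y of the first point, then x and y of the second) is an assumption.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares affine transform (3x3, homogeneous) mapping src onto dst.

    `src` and `dst` are N x 2 arrays of matched points, N >= 3.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    A = np.hstack([src, np.ones((len(src), 1))])        # rows [x, y, 1]
    params, *_ = np.linalg.lstsq(A, dst, rcond=None)    # 3 x 2 solution
    M = np.eye(3)
    M[:2, :] = params.T
    return M

def apply_affine(M, pts):
    """Apply a 3x3 affine matrix to an N x 2 array of points."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    return (M @ homog.T).T[:, :2]

def compose_path(steps):
    """Combine transforms applied in order: steps[0] first, steps[-1] last.

    For the example in the text, steps would be [T_3_to_2, T_2_to_1].
    """
    M = np.eye(3)
    for step in steps:
        M = step @ M
    return M
```

         With a match array `matches` of shape N x 4, `fit_affine(matches[:, :2], matches[:, 2:])` gives the transform in one direction and swapping the two arguments gives the other, so both directions of every match are available when searching for a path to the base piece.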

         The image will be resized during the stitching process to accommodate the images being warped into the base image. If at any point pixels are added to the left or top of the image, the (x,y) position of the base image will be shifted. The amount of this shift is calculated and factored into the translation portion of all the affine matrices that are computed.
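
         The shift bookkeeping can be sketched as follows, again in Python with NumPy and with 3x3 homogeneous affine matrices. The function names and the corner-based bounding-box test are assumptions made for illustration; the text does not say how the shift is actually measured.

```python
import numpy as np

def canvas_shift(M, width, height):
    """Shift (dx, dy) needed so that a width x height image warped by M
    has no negative coordinates. Only the four corners are checked,
    which is sufficient because an affine map sends the image rectangle
    to a parallelogram."""
    corners = np.array([[0, 0, 1],
                        [width, 0, 1],
                        [0, height, 1],
                        [width, height, 1]], dtype=float).T
    warped = M @ corners
    dx = max(0.0, -warped[0].min())
    dy = max(0.0, -warped[1].min())
    return dx, dy

def with_shift(M, dx, dy):
    """Fold a canvas shift into the translation part of an affine matrix."""
    T = np.array([[1.0, 0.0, dx],
                  [0.0, 1.0, dy],
                  [0.0, 0.0, 1.0]])
    return T @ M
```

         The same shift would be applied to every transform, including the identity transform of the base piece, so that all pieces stay aligned on the enlarged canvas.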


Finished Product



Other Puzzle Arrangements