Project 2: Fun with Filters and Frequencies!

By Ethan Zhang

Overview

Using a whole lot of Gaussian filters, we are able to achieve various image manipulations such as edge detection, image sharpening, hybrid images, and multi-resolution blending.

Part 1: Fun with Filters

Part 1.1: Finite Difference Operator

For this part, we explore using the Dx and Dy finite difference filters to generate a gradient magnitude image. Since a large partial derivative in an image implies a region of high contrast, the gradient magnitude image can be thresholded to highlight the edges within an image.

First, I obtained an example image of a cameraman. Then, I constructed the finite difference filters Dx = [-1 1] and Dy = [-1 1]^T. Since the gradient magnitude can be derived from the two partial derivatives, I combined them and binarized the result (threshold = 0.30) to get the final image. For the threshold, I experimented with values that would include the tripod edges while reducing the noise from the grassy ground.
Original
Convolved w/ Dx
Convolved w/ Dy
Grad Magnitude
Final Result
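The pipeline above can be sketched in a few lines of Python (a minimal sketch using NumPy and SciPy; the 0.30 threshold follows the writeup, while the function and variable names are my own):

```python
import numpy as np
from scipy.signal import convolve2d

def gradient_edges(im, threshold=0.30):
    """Binarized gradient magnitude of a grayscale image in [0, 1]."""
    Dx = np.array([[-1.0, 1.0]])           # finite difference in x
    Dy = Dx.T                              # finite difference in y
    gx = convolve2d(im, Dx, mode="same", boundary="symm")
    gy = convolve2d(im, Dy, mode="same", boundary="symm")
    grad_mag = np.sqrt(gx**2 + gy**2)      # combine partial derivatives
    return grad_mag, grad_mag > threshold
```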

Part 1.2: Derivative of Gaussian (DoG) Filter

Since the image has a bit of noise, the previous result marked some specks as edges (e.g., within the grass). Although noise is mostly unavoidable, it is possible to pre-process the image to highlight the sustained edges by smoothing out the high frequencies. To do this, I used a Gaussian filter with an 11x11 kernel and sigma = 1. This set of parameters smoothed the image in a way that preserved most of the long edges while reducing the specks. After this, I regenerated the partial derivatives and reconstructed the gradient magnitude image. For this part, I had to use a smaller threshold (threshold = 0.125) because the smoothing had significantly reduced the gradient magnitudes.
Original
Smoothed
Convolved w/ Dx
Convolved w/ Dy
Grad Magnitude
Final Result
As a result of the smoothing, most of the specks were gone and the edges were thicker.
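As a sketch, the two-step version (blur, then differentiate) looks like this; the 11x11 / sigma = 1 kernel and the 0.125 threshold come from the writeup, and the helper names are my own:

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(ksize, sigma):
    """2D Gaussian kernel as the outer product of a normalized 1D Gaussian."""
    ax = np.arange(ksize) - (ksize - 1) / 2.0
    g = np.exp(-(ax**2) / (2 * sigma**2))
    g /= g.sum()
    return np.outer(g, g)

def smoothed_edges(im, ksize=11, sigma=1.0, threshold=0.125):
    """Blur first, then take finite differences (two separate convolutions)."""
    G = gaussian_kernel(ksize, sigma)
    smoothed = convolve2d(im, G, mode="same", boundary="symm")
    Dx = np.array([[-1.0, 1.0]])
    gx = convolve2d(smoothed, Dx, mode="same", boundary="symm")
    gy = convolve2d(smoothed, Dx.T, mode="same", boundary="symm")
    return np.sqrt(gx**2 + gy**2) > threshold
```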

In the previous method, I used separate operations to convolve with the Gaussian filter and then with the finite difference operators. However, it is possible to convolve the finite difference operators with the Gaussian filter first to create DoG filters, reducing the number of convolutions over the image by one. In the following figure, we see that this does not change the end result at all.
Original
Dx Smoothed
Dy Smoothed
Grad Magnitude
Final Result

Part 2: Fun with Frequencies!

Part 2.1: Image "Sharpening"

In order to "sharpen" images, it is possible to add in additional high-frequency components of an image. Specifically, I chose to convolve the image with an "unsharp mask" filter, which is a scaled unit impulse filter minus a scaled Gaussian filter (the Gaussian representing the low frequencies).
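A minimal sketch of this single-convolution filter, assuming the filter is ((1 + alpha) * impulse - alpha * Gaussian), which is equivalent to im + alpha * (im - blurred); the defaults match the parameters used below, and the helper names are my own:

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(ksize, sigma):
    ax = np.arange(ksize) - (ksize - 1) / 2.0
    g = np.exp(-(ax**2) / (2 * sigma**2))
    g /= g.sum()
    return np.outer(g, g)

def unsharp_mask(im, alpha=0.7, ksize=7, sigma=1.0):
    """Sharpen via a single convolution: (1 + a)*impulse - a*Gaussian."""
    impulse = np.zeros((ksize, ksize))
    impulse[ksize // 2, ksize // 2] = 1.0
    filt = (1 + alpha) * impulse - alpha * gaussian_kernel(ksize, sigma)
    return np.clip(convolve2d(im, filt, mode="same", boundary="symm"), 0, 1)
```

Since the filter's weights sum to one, flat regions are unchanged; only regions with high-frequency content (edges) are boosted.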

Using an alpha of 0.7, a 7x7 kernel, and sigma of 1, I created the following set of sharpened images:
taj.png
taj.png (Sharpened)
monastery.png
monastery.png (Sharpened)
cathedral.png
cathedral.png (Sharpened)
Then, for evaluation, I picked an already-sharp image, blurred it, and re-sharpened it with the same Gaussian filter to see whether this process might be reversible.
train.jpg
train.jpg (Smoothed)
train.jpg (Sharpened)
After re-sharpening the image, the edges became noticeably sharper and the image regained an overall "sharp" quality. However, the intensity of the colors seems to have diminished (almost blurred in a color sense).

Part 2.2: Hybrid Images

For this part, I tried to create "hybrid images" with the approach described in the SIGGRAPH 2006 paper by Oliva, Torralba, and Schyns. Specifically, it is possible to take advantage of how human perception works by merging the high frequencies of one image with the low frequencies of another. We then tend to see the lower frequencies at further distances while focusing on the higher frequencies up close.

To tune the hybrid images, separate parameters were created for each Gaussian filter used. I also added a "ratio" factor that scales the amount each image contributes to the result.
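The construction can be sketched as follows. This is one plausible reading of the "ratio" factor (here it scales the high-frequency component); the 6*sigma + 1 kernel size is a common heuristic I'm assuming, since the writeup only specifies the sigmas:

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(ksize, sigma):
    ax = np.arange(ksize) - (ksize - 1) / 2.0
    g = np.exp(-(ax**2) / (2 * sigma**2))
    g /= g.sum()
    return np.outer(g, g)

def blur(im, sigma):
    ksize = int(6 * sigma) + 1               # kernel-size heuristic (assumption)
    return convolve2d(im, gaussian_kernel(ksize, sigma),
                      mode="same", boundary="symm")

def hybrid(im_low, im_high, sigma_low, sigma_high, ratio):
    """Low frequencies of im_low plus scaled high frequencies of im_high."""
    low = blur(im_low, sigma_low)
    high = im_high - blur(im_high, sigma_high)   # high-pass residual
    return np.clip(low + ratio * high, 0, 1)
```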

Using sigma_low_freq = 3, sigma_high_freq = 6, and ratio = 0.3, I was able to create the following Derek/cat image:
DerekPicture.jpg
cat.jpeg
Derek Cat
This cat/dog was created with sigma_low_freq = 6, sigma_high_freq = 6, and ratio = 0.8:
cat.jpeg
dog.jpeg
Cat Dog
This Mona Lisa/Einstein was created with sigma_low_freq = 3, sigma_high_freq = 11, and ratio = 0.5:
mona_lisa.jpeg
einstein.jpeg
Mona Lisa Einstein
For this hybrid, I believe the effect doesn't work as well because the facial structures are quite different and the low-frequency image's colors overpower the Einstein face.

This happy/angry was created with sigma_low_freq = 6, sigma_high_freq = 6, and ratio = 0.125:
smile.jpg
mad.jpg
Smile Mad
This process of keeping the high / low frequencies can be demonstrated by the 2D Fourier transform of each stage of the transformation:
FT of smile.jpg
FT of mad.jpg
Low Passed smile.jpg
High Passed mad.jpg
FT of Smile Mad
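Log-magnitude spectra like these can be computed with NumPy's FFT (a minimal sketch; the small epsilon just avoids log(0) for display):

```python
import numpy as np

def fft_display(gray):
    """Log-magnitude 2D spectrum, with the DC term shifted to the center."""
    spectrum = np.fft.fftshift(np.fft.fft2(gray))
    return np.log(np.abs(spectrum) + 1e-8)
```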
Bells & Whistles: Colors
The original hybrid images, like the smile/mad example, were kept fully in grayscale. This reduces the convolutions to just one channel, but I decided to also experiment with coloring the two images. Specifically, I ended up coloring just the low-passed transformation of Mona Lisa/Einstein and Derek/cat because, subjectively, the low-passed images seemed harder for me to see. Adding color made them more visible, and I was able to balance the ratio to show more high frequencies for the close-up view. In one case, the cat/dog, the high-frequency dog was a bit harder to see, so I ended up coloring only the high frequencies.

In the end, it seems like either high or low frequencies can be colored depending on the scenario.

Part 2.3: Gaussian and Laplacian Stacks

In order to begin implementing multi-resolution blending, I first implemented the creation of Gaussian and Laplacian stacks. In particular, I applied my stack functions with a kernel size of 11x11 and sigma of 3 to generate the following stacks for oraple.jpg (where each Laplacian level is the element-wise difference of consecutive Gaussian levels, and the last index is the last Gaussian level).
oraple.jpg
oraple.jpg
Gaussian level 1
Gaussian level 2
Gaussian level 3
Gaussian level 4
Gaussian level 5
Laplacian level 1
Laplacian level 2
Laplacian level 3
Laplacian level 4
Laplacian level 5
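A sketch of the stack construction (unlike a pyramid, nothing is downsampled; the 11x11 / sigma = 3 defaults come from the writeup, and the helper names are my own):

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(ksize, sigma):
    ax = np.arange(ksize) - (ksize - 1) / 2.0
    g = np.exp(-(ax**2) / (2 * sigma**2))
    g /= g.sum()
    return np.outer(g, g)

def gaussian_stack(im, levels, ksize=11, sigma=3.0):
    """Repeatedly blur without downsampling; level 0 is the input image."""
    G = gaussian_kernel(ksize, sigma)
    stack = [im]
    for _ in range(levels - 1):
        stack.append(convolve2d(stack[-1], G, mode="same", boundary="symm"))
    return stack

def laplacian_stack(gstack):
    """Element-wise differences of consecutive levels; the last level is the
    final Gaussian, so the stack sums back to the original image."""
    return [a - b for a, b in zip(gstack, gstack[1:])] + [gstack[-1]]
```

A handy sanity check: summing all Laplacian levels telescopes back to the original image exactly.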

Part 2.4: Multiresolution Blending (a.k.a. the oraple!)

Then, to implement multiresolution blending, I first generate a Gaussian stack of the base mask I intend to use. To create the oraple, I used two masks: a half/half vertical split and a half/half horizontal split. I also generate the Laplacian stacks of the two input images, so that each layer of the stacks can be combined with the weights given by the corresponding level of the mask's Gaussian stack.
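The per-level blend can be sketched as follows (single channel; the stack helpers are rebuilt here so the snippet is self-contained, and the default parameters mirror the ones listed below):

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(ksize, sigma):
    ax = np.arange(ksize) - (ksize - 1) / 2.0
    g = np.exp(-(ax**2) / (2 * sigma**2))
    g /= g.sum()
    return np.outer(g, g)

def gaussian_stack(im, levels, ksize, sigma):
    G = gaussian_kernel(ksize, sigma)
    stack = [im]
    for _ in range(levels - 1):
        stack.append(convolve2d(stack[-1], G, mode="same", boundary="symm"))
    return stack

def laplacian_stack(gstack):
    return [a - b for a, b in zip(gstack, gstack[1:])] + [gstack[-1]]

def blend(im1, im2, mask, levels=5,
          ksize_filter=81, sigma_filter=25.0,
          ksize_image=5, sigma_image=5.0):
    """Weight each Laplacian level by the mask's Gaussian stack and sum."""
    gm = gaussian_stack(mask, levels, ksize_filter, sigma_filter)
    l1 = laplacian_stack(gaussian_stack(im1, levels, ksize_image, sigma_image))
    l2 = laplacian_stack(gaussian_stack(im2, levels, ksize_image, sigma_image))
    out = sum(m * a + (1 - m) * b for m, a, b in zip(gm, l1, l2))
    return np.clip(out, 0, 1)
```

Blurring the mask's Gaussian stack more aggressively than the images gives each level a progressively softer seam, which is what hides the transition.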

For the Oraple specifically, I chose to use kernel_filter = 81x81, kernel_image_gauss = 5x5, sigma_filter = 25, sigma_image_gauss = 5, and N = 20:
apple.jpeg
orange.jpeg
oraple-v.png
oraple-h.png
For this meteor blend, I created a custom mask for the meteor by binarizing the image. Then, I used kernel_filter = 81x81, kernel_image_gauss = 5x5, sigma_filter = 25, sigma_image_gauss = 5, and N = 20:
costco.png
meteor.png
mask.png
costco-meteor.png
For this Eiffel Tower / Oriental Pearl Tower blend, I used a horizontal mask with kernel_filter = 81x81, kernel_image_gauss = 5x5, sigma_filter = 25, sigma_image_gauss = 5, and N = 5:
oriental.png
eiffel.png
oriental-eiffel.png
To illustrate the complete process, I've also displayed the Laplacian stacks (grayscaled) of the two input images, as well as the Gaussian stacks of the result:
oriental level 1
oriental level 2
oriental level 3
oriental level 4
oriental level 5
eiffel level 1
eiffel level 2
eiffel level 3
eiffel level 4
eiffel level 5
result level 1
result level 2
result level 3
result level 4
result level 5
Bells & Whistles: Color
Although this process is generally applied to a single-channel, grayscale image, I decided to add color by running all the convolutions on all three channels (using grayscale for the debugging output). This did make the backgrounds of some images blend together oddly because of their differing colors, but with similar backgrounds, like the oraple's, the effect works quite well.