Means, Medians and Images
Means and medians are some of the first concepts we learn in any basic statistics course. Such concepts are usually illustrated with examples such as the mean height of the students in a class (or more sensitive information such as age or weight), or the mean or median salary at a company. It is essential to understand these concepts to better understand the world around us. In this post, however, the goal is to explain neither the concepts nor why they are important, but to answer the question: what do they have to do with computer science?
A straightforward application is to compute the mean execution time of a given algorithm, which might help in understanding asymptotic notation. Don’t worry if you have no idea what asymptotic notation is. I’d like to focus on another application: image processing.
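For the curious, the execution-time application mentioned above could be sketched as follows. This is only a minimal illustration using Python's standard `time` and `statistics` modules; the helper name `mean_execution_time` and the choice of the built-in `sorted` as the algorithm under test are assumptions made for this example.

```python
import statistics
import time


def mean_execution_time(func, arg, runs=5):
    """Return the mean wall-clock time (in seconds) of func(arg) over several runs."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        func(arg)
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings)


# Time the built-in sort on inputs of growing size (illustrative sizes).
for n in (1_000, 10_000, 100_000):
    data = list(range(n, 0, -1))
    print(n, mean_execution_time(sorted, data))
```

The printed times depend on the machine, so no particular values should be expected.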
Digital images
Before talking about image processing, let’s review what a digital image is. The most common way to represent an image in a computer is with a table, or matrix, of pixels. A pixel is the representation of a color, in a color image, or intensity, in a grayscale image. We will work with grayscale images for simplicity. Most grayscale images have 256 shades (no, not 50), or levels, varying from black (0) to white (255). A grayscale image is just a table with numbers between 0 and 255:
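As a rough illustration of such a table, a tiny grayscale image could be written in Python as a list of rows. The pixel values below are made up for this sketch and are not taken from the image in this post.

```python
# A made-up 4 x 4 grayscale image: each number is one pixel's intensity,
# from 0 (black) to 255 (white).
image = [
    [0, 64, 128, 255],
    [32, 96, 160, 224],
    [64, 128, 192, 255],
    [0, 0, 255, 255],
]

height = len(image)      # number of rows
width = len(image[0])    # number of columns
print(height, width)
print(image[1][2])       # intensity of the pixel at row 1, column 2
```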
Using means with images
Back to the statistics. As the image is just a table of numbers, we can compute its mean. For example, the mean of the image above is 109.26. It doesn’t say much, right?
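Computing that single number is straightforward. The sketch below assumes the image is stored as a list of rows of integers; `image_mean` is a hypothetical helper name, and the small image is invented for the example (it is not the one that gives 109.26).

```python
def image_mean(image):
    """Mean intensity over every pixel of a list-of-rows grayscale image."""
    total = sum(sum(row) for row in image)
    count = sum(len(row) for row in image)
    return total / count


# Hypothetical 2 x 3 image, not the one shown in this post.
tiny = [
    [0, 100, 200],
    [50, 150, 250],
]
print(image_mean(tiny))  # 125.0
```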
What if we consider the mean of a smaller region of the image? Let’s see what happens when we replace each pixel by the mean of its own value and those of its neighbors (a total of 9 pixels: the pixel itself and the pixels to the right, left, top, bottom, and diagonals, when they exist):
In this case, the mean smoothed the difference between adjacent pixels! Visually, the borders are smoothed and the whole image is blurred and a little washed out. Let’s see what happens when we consider 25 pixels for the mean (2 pixels to each side):
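The neighborhood averaging described above can be sketched in a few lines of plain Python. This is one possible implementation under stated assumptions, not necessarily how the images in this post were produced: the image is a list of rows, the function name `mean_filter` and its `radius` parameter are invented here, pixels outside the image are simply ignored, and the result is rounded to an integer intensity.

```python
def mean_filter(image, radius=1):
    """Replace each pixel by the mean of the square window centered on it.

    radius=1 gives a 3 x 3 window (9 pixels), radius=2 a 5 x 5 window (25).
    Near the edges, only the neighbors that exist are averaged.
    """
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            # Round so the result is still an integer intensity.
            row.append(round(sum(window) / len(window)))
        result.append(row)
    return result


# A single bright pixel (made-up values) gets spread over its neighbors.
spot = [
    [0, 0, 0],
    [0, 180, 0],
    [0, 0, 0],
]
print(mean_filter(spot))
```

Calling `mean_filter(image, radius=2)` would use the 25-pixel window instead of the 9-pixel one.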
The process of image acquisition and storage isn’t perfect. Thus it’s common to have noise in the image. Let’s consider a simple case: an image with 4 squares, two white squares and two black squares. We’ll add random noise: some pixels that should be black were registered as white and some pixels that should be white were registered as black.
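A noisy test image of this kind could be generated along the following lines. This is a sketch under assumptions: the checkerboard arrangement of the four squares, the helper names `make_squares` and `add_noise`, the 10% noise level, and the fixed random seed are all choices made for this example, not details from the original experiment.

```python
import random


def make_squares(size):
    """Build a (2*size) x (2*size) image of four squares in a checkerboard layout."""
    image = []
    for y in range(2 * size):
        row = []
        for x in range(2 * size):
            # Top-left and bottom-right squares are black, the other two white.
            same_half = (y < size) == (x < size)
            row.append(0 if same_half else 255)
        image.append(row)
    return image


def add_noise(image, fraction, seed=0):
    """Return a copy where roughly `fraction` of the pixels are flipped (0 <-> 255)."""
    rng = random.Random(seed)
    return [
        [255 - p if rng.random() < fraction else p for p in row]
        for row in image
    ]


clean = make_squares(8)
noisy = add_noise(clean, fraction=0.1)
flipped = sum(a != b for ra, rb in zip(clean, noisy) for a, b in zip(ra, rb))
print(flipped, "of", 16 * 16, "pixels flipped")
```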
Let’s repeat the process of computing the mean of the 25 neighboring pixels:
As most of the pixels in a window have the same intensity, the mean tends to get close to their value: in this case, either black or white. As a result, incorrect white pixels turn darker and incorrect black pixels turn brighter, since most of their neighbors have the opposite color. In other words, this process can be used to reduce the noise in an image. A side effect is that the noise spreads to the neighboring pixels, which initially had the correct color. This process of replacing a pixel by the mean value of its neighborhood is known as a mean filter or average filter.
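To make the arithmetic concrete, here is a small check assuming a 5 x 5 window that should be entirely black but contains a single noisy white pixel. The numbers are illustrative only.

```python
# 24 correct black pixels (0) and one noisy white pixel (255).
window = [0] * 24 + [255]
mean = sum(window) / len(window)
print(mean)  # 10.2
```

The noisy pixel drops from 255 to about 10, which is almost black. The same 255 also raises the mean of every other window that contains it, which is the spreading effect mentioned above.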
The set of pixels used to compute the mean is called the window. The number of pixels in the window can be larger than 25: there is no limit on either the size or the shape of the window. The following image shows the application of a mean filter with 121 pixels in the window (an 11 x 11 pixel square) to the same noisy image:
Note that with this larger window size, the noise fades even further. However, a side effect is that the borders between the black and white squares are even blurrier than with the 5 x 5 window. Let’s apply the mean filter to our original image with an 11 x 11 window:
In the extreme case, we can use a very large window (larger than the image itself). Each pixel is then replaced by the average of all the pixels in the image (109 in our example). The result is an image with a single grayscale intensity:
Medians and images
What would happen if we simply replaced the mean by the median? Let’s initially consider a property of medians. The median of a set is its middle value once the numbers are sorted, so it doesn’t matter how large or small the first and last numbers are. Likewise, an extreme value such as 255 at the end of the sorted set doesn’t affect the computation of the median, which will still be the same middle value no matter how large a number we use instead of 255. This is not true for the mean. Let’s consider our black and white squares example once more:
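The property of medians described above is easy to check with Python's standard `statistics` module. The values below are made up for this sketch and are not the sets used in the original text.

```python
import statistics

# Made-up pixel values with one extreme outlier.
values = [10, 12, 11, 255, 13]
print(statistics.median(values))  # 12
print(statistics.mean(values))    # 60.2

# Replace the outlier with something even larger: the median does not move.
values_bigger = [10, 12, 11, 100000, 13]
print(statistics.median(values_bigger))  # 12
```

The mean, on the other hand, is dragged far away from the typical values by the single outlier.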
What do you think will happen if we replace each pixel by the median of its 5 by 5 neighbor window? Within a black square, the majority of the pixels in any window are black, so the median will likely be 0. Likewise, within a white square it will likely be 255. This means the result will be:
This example shows an interesting property of median filters. While a mean filter blurs the whole image, smoothing the borders, a median filter preserves borders. Now, remember our original image? Let’s sprinkle some wrong pixels on it (we picked some pixels randomly and replaced them by 255 minus their value):
Let’s see what happens when we apply the median filter to the noisy image (using a window of 3 x 3 pixels):
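A median filter can be sketched the same way as a mean filter, with the median taking the place of the mean. The code below is a minimal illustration under assumptions: the image is a list of rows, `median_filter` and `radius` are names invented here, pixels outside the image are ignored, and `statistics.median_low` is used so that the output is always a value that actually occurs in the window.

```python
import statistics


def median_filter(image, radius=1):
    """Replace each pixel by the median of the square window centered on it.

    radius=1 gives a 3 x 3 window, radius=2 a 5 x 5 window.
    Near the edges, only the neighbors that exist are used.
    """
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            # median_low sorts the window and returns its (lower) middle value.
            row.append(statistics.median_low(window))
        result.append(row)
    return result


# A black/white border with one noisy pixel on each side (made-up example).
noisy = [
    [0, 0, 0, 255, 255, 255],
    [0, 0, 0, 255, 255, 255],
    [0, 255, 0, 255, 0, 255],
    [0, 0, 0, 255, 255, 255],
    [0, 0, 0, 255, 255, 255],
]
for row in median_filter(noisy):
    print(row)
```

In this small example both noisy pixels are removed and the border between the black and white columns stays exactly where it was, whereas a mean filter would leave gray values along it.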
A disadvantage of median filters compared with mean filters is their higher computational cost. While the computation of the mean is linear ($O(n)$), where $n$ is the number of pixels in the window (we just have to sum all values and divide by $n$), the computation of the median takes $O(n \log n)$ (we have to sort the pixel values to return the central one). But for small window sizes (e.g. 3), it is pretty much constant.
Another way to understand the effect of mean filters on images uses the concept of the Fourier transform. I find this explanation particularly beautiful, but let’s leave it for a future post.