Edge Detection and Image Segmentation
In our last lab session, we wrote programs that detect the edges found in images and segment images based on a region we were interested in.
In this activity, we defined edges as local regions in an image where a significant shift in brightness occurs [1]. With that definition, we can treat an image as a scalar field with a particular brightness value at every pixel.
How, then, do we know when there is a significant shift in brightness, i.e., an edge? Since we consider an image to be a scalar field, to find a shift in brightness we simply take the gradient of that scalar field. And how do we take the gradient of an image? Simple: convolve the image with edge operators.
In detecting edges, we made use of two operators, the Prewitt \(P\) and Sobel \(S\) edge operators, given as:
\begin{equation}\begin{split}\text{Prewitt}\hspace{1.25cm}P_x=\begin{bmatrix}-1 & -1 & -1 \\ 0 & 0 & 0 \\ 1 & 1 & 1\end{bmatrix} \hspace{0.75cm} P_y=\begin{bmatrix}-1 & 0 & 1 \\ -1 & 0 & 1 \\ -1 & 0 & 1\end{bmatrix} \\ \text{Sobel} \hspace{1.25cm} S_x=\begin{bmatrix}-1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1\end{bmatrix} \hspace{0.75cm} S_y=\begin{bmatrix}-1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1\end{bmatrix}\end{split}\end{equation}
Note that each of these operators acts in only one direction, either horizontal \(x\) or vertical \(y\). Thus, given an image \(I\), if we convolve it with the horizontal Prewitt edge operator \(P_x\), written \(P_x*I\), we extract only the horizontal edges found in the image. Likewise, if we convolve \(I\) with the vertical Sobel edge operator \(S_y\), we extract only the vertical edges. Thus, to obtain all edges, we evaluate:
\begin{equation}G=\sqrt{G_x^2+G_y^2}\end{equation}
where \(G_x=O_x*I\) and \(G_y=O_y*I\) for \(O\in\{P,S\}\).
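As a concrete illustration, here is a minimal sketch of this pipeline in Python using NumPy and SciPy. It is not the actual lab code: the kernel arrays follow the convention above, and the helper name `edge_magnitude` is our own.

```python
import numpy as np
from scipy.ndimage import convolve

# Sobel kernels, following the convention above:
# S_x responds to horizontal edges, S_y to vertical edges.
Sx = np.array([[-1, -2, -1],
               [ 0,  0,  0],
               [ 1,  2,  1]], dtype=float)
Sy = Sx.T  # the vertical operator is the transpose of the horizontal one

def edge_magnitude(image, kx, ky):
    """Convolve a grayscale image with a kernel pair and return
    the gradient magnitude G = sqrt(Gx^2 + Gy^2)."""
    gx = convolve(image, kx)
    gy = convolve(image, ky)
    return np.sqrt(gx**2 + gy**2)
```

Calling `edge_magnitude(img.astype(float), Sx, Sy)` on a grayscale array then evaluates \(G=\sqrt{G_x^2+G_y^2}\) pixel by pixel.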
Applying this to a binary image that has purely horizontal or vertical edges, we get Figure 1.
As expected, we get only horizontal edges when using \(O_x\) and only vertical edges when using \(O_y\), for \(O\in\{P,S\}\).
What happens if we use a picture with edges that are not purely horizontal or vertical? Doing so gives us Figure 2.
Notice that since the image doesn't contain purely horizontal or vertical edges, applying the horizontal and vertical edge operators gives us most, if not all, of the edges. Sadly, we cannot extract purely diagonal edges using horizontal or vertical edge operators.
Things get more interesting if we try this technique on an actual photograph. Doing so, we get Figure 3.
Figure 3: (a) Original image and the resulting edges detected by the (b) Prewitt and (c) Sobel edge operators.
Nice, it was able to get all of the edges found in an actual image.
Since we weren't able to detect diagonal edges, we proposed one way to do so. The elements of the horizontal and vertical edge operators are oriented along a single axis, so what if, to get diagonal edges, we rotate the elements by 45°? This gives:
\begin{equation}\begin{split}P_x^\prime=\begin{bmatrix}-1 & -1 & 0 \\ -1 & 0 & 1 \\ 0 & 1 & 1\end{bmatrix} \hspace{0.75cm} P_y^\prime=\begin{bmatrix}0 & 1 & 1 \\ -1 & 0 & 1 \\ -1 & -1 & 0\end{bmatrix} \\ S_x^\prime=\begin{bmatrix}-2 & -1 & 0 \\ -1 & 0 & 1 \\ 0 & 1 & 2\end{bmatrix} \hspace{0.75cm} S_y^\prime=\begin{bmatrix}0 & 1 & 2 \\ -1 & 0 & 1 \\ -2 & -1 & 0\end{bmatrix}\end{split}\end{equation}
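In code, these rotated kernels slot straight into the earlier sketch; as before, the array names are our own, and `img` is assumed to be a grayscale float array.

```python
# Rotated (diagonal) Prewitt kernels from the equation above.
Px_diag = np.array([[-1, -1,  0],
                    [-1,  0,  1],
                    [ 0,  1,  1]], dtype=float)
Py_diag = np.array([[ 0,  1,  1],
                    [-1,  0,  1],
                    [-1, -1,  0]], dtype=float)

# Reuse edge_magnitude() from the earlier sketch.
diag_edges = edge_magnitude(img, Px_diag, Py_diag)
```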
Applying these operators to our previous images, we get Figure 4.
Still, the edge operators we suggested give us most of the edges, so sadly we cannot truly isolate diagonal edges. This can be attributed to the fact that we should have defined diagonal edges more precisely. What makes an edge diagonal?
The last thing we did in this activity was to segment an image. In doing so, we made use of the normalized chromaticity coordinate (NCC) representation of an image. In the RGB representation, each pixel is given as \((R,G,B)\), with \(R\), \(G\), and \(B\) the red, green, and blue channel values of that pixel; in the NCC representation, each pixel is given as \((r,g,b)\), where [2]:
\begin{equation}r=\dfrac{R}{R+G+B} \hspace{1cm} g=\dfrac{G}{R+G+B} \hspace{1cm} b=\dfrac{B}{R+G+B}\end{equation}
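This conversion is a one-liner per channel; here is a minimal NumPy sketch, assuming an \((H, W, 3)\) image array (the epsilon guard against all-black pixels is our own addition):

```python
import numpy as np

def rgb_to_ncc(rgb):
    """Map each pixel (R, G, B) to normalized chromaticity
    coordinates (r, g, b), where r = R / (R + G + B), etc."""
    rgb = rgb.astype(float)
    total = rgb.sum(axis=2, keepdims=True)
    return rgb / (total + 1e-12)  # epsilon avoids 0/0 on black pixels
```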
In segmenting an image, we first found its NCC representation and chose a monochromatic region of interest (ROI). Then, we obtained the probability distributions of all the \(r\) and \(g\) values in the ROI, \(P(r)\) and \(P(g)\). From these, we formed a two-dimensional probability density function (PDF), given by \(P(r,g)=P(r)P(g)\). Finally, we mapped every pixel of the original image onto another image through \(P(r,g)\): a pixel with NCC representation \((r_0,g_0,b_0)\) is assigned the single value \(P(r_0,g_0)\), so pixels whose chromaticity resembles the ROI's light up.
The interesting part of this segmentation scheme is how we get \(P(r,g)\), or rather \(P(r)\) and \(P(g)\). We can do it parametrically or nonparametrically. Doing it parametrically entails getting the mean and standard deviation of the \(r\) and \(g\) values in the ROI and assuming that \(P(r)\) and \(P(g)\) are Gaussian, i.e., follow:
\begin{equation}P(r)=\dfrac{1}{\sqrt{2\pi\sigma_r^2}}\mathrm{e}^{-(r-\langle r\rangle)^2/2\sigma_r^2}\end{equation}
with \(\langle r\rangle\) and \(\sigma_r\) the mean and standard deviation of the \(r\) values (and likewise for \(g\)). On the other hand, doing it nonparametrically entails simply taking a normalized histogram of the \(r\) and \(g\) values in the ROI.
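Here is a hedged sketch of both estimators in NumPy. The function names are our own, and the nonparametric version uses a joint 2D histogram of the ROI's \((r, g)\) pairs, which is one reasonable reading of "a normalized histogram of the \(r\) and \(g\) values."

```python
import numpy as np

def parametric_pdf(r, g, roi_r, roi_g):
    """Joint likelihood P(r, g) = P(r) P(g) under the Gaussian
    assumption, with means and standard deviations taken from
    the ROI's chromaticity values."""
    def gauss(x, mu, sigma):
        return np.exp(-(x - mu)**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)
    return (gauss(r, roi_r.mean(), roi_r.std()) *
            gauss(g, roi_g.mean(), roi_g.std()))

def nonparametric_pdf(r, g, roi_r, roi_g, bins=32):
    """Backprojection through a normalized 2D histogram of the
    ROI's (r, g) values."""
    hist, r_edges, g_edges = np.histogram2d(roi_r, roi_g, bins=bins,
                                            range=[[0, 1], [0, 1]],
                                            density=True)
    # Look up each pixel's (r, g) bin in the ROI histogram.
    ri = np.clip(np.digitize(r, r_edges) - 1, 0, bins - 1)
    gi = np.clip(np.digitize(g, g_edges) - 1, 0, bins - 1)
    return hist[ri, gi]
```

Given `ncc = rgb_to_ncc(image)` and the ROI's flattened chromaticities `roi_r` and `roi_g`, calling `parametric_pdf(ncc[..., 0], ncc[..., 1], roi_r, roi_g)` returns the backprojected image directly.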
We applied these techniques to an image of a flower, since flowers are great subjects for image segmentation: they have very distinct colors compared to the rest of the plant. Applying both techniques and visualizing the PDFs used for parametric and nonparametric segmentation, we get Figure 5.
Figure 5: (a) Original image we want to segment, (b) region of interest, and probability distributions used in applying (c)-(d) parametric and (e)-(f) nonparametric segmentation.
Notice that the 2D PDF \(P(r,g)\) used for parametric segmentation includes the color of the flower, whereas the one for nonparametric segmentation doesn't. This greatly affects the resulting segmented images, shown in Figure 6.
As expected, we get the flower and part of the ground when segmenting the image parametrically, and we miss it when doing so nonparametrically. If we wanted purely the flower, we should have blacked out the unwanted regions in the ROI, but since that would change the distributions even more, this is the best result we got. Really nice!
References:
[1] R. C. Gonzalez and R. E. Woods, Digital Image Processing (Prentice Hall, USA, 2002).
[2] CIE's XYZ Coordinate System (Web). Accessed on 10/21/18 at http://fourier.eng.hmc.edu/e180/lectures/color1/node25.html