Simple Image Enhancement Methods
Last lab session, we were tasked to perform point transformations on images, where one uses a transformation function \(\mathrm{T}[\ ]\) on an image \(f(x,y)\) to obtain another image \(g(x,y)\), or:
\begin{equation}g(x,y)=\mathrm{T}\left[f(x,y)\right]\end{equation}
[1]. Specifically, we were told to do the following transforms: image negative, logarithmic and gamma transforms, contrast stretching and thresholding, and intensity-level slicing.
From [1], if the maximum gray value present in the image \(f(x,y)\) is \(L\), \(r\) denotes a gray level of \(f(x,y)\), and \(s\) the corresponding gray level of \(g(x,y)\), then the image negative is obtained by:
\begin{equation}s=L-r \label{nega}\end{equation}
whereas the image's logarithmic transform is given by:
\begin{equation}s=c\log(1+r) \label{log}\end{equation}
and the gamma transform by:
\begin{equation}s=cr^\gamma \label{gamma}\end{equation}
To stretch the contrast of an image, we linearly map the gray values between the minimum \(r_\text{min}\) and maximum \(r_\text{max}\) present in the image onto a chosen output range from \(r_\text{a}\) to \(r_\text{b}>r_\text{a}\), letting all resulting \(s>255\) be \(255\) and \(s<0\) be \(0\) [2]; contrast is stretched when the output range is wider than the input range. In expression form, this is given by
\begin{equation}s=(r-r_\text{min})\left(\dfrac{r_\text{b}-r_\text{a}}{r_\text{max}-r_\text{min}}\right)+r_\text{a} \label{stretch}\end{equation}
[2].
Further, contrast thresholding can be done by setting a threshold \(t\in[0,255]\) such that
\begin{equation}s=\begin{cases}0 & r<t \\ 255 & r\geq t\end{cases} \label{thresh}\end{equation}
[1].
Lastly, to do intensity-level slicing, one can do one of two things: (1) highlight certain gray values while diminishing the others, or (2) highlight certain gray values while retaining the others [1]. An expression for (1) could be:
\begin{equation}s=\begin{cases}a & r\in[r_\text{a}, r_\text{b}] \\ b & \text{otherwise}\end{cases}\label{intensity}\end{equation}
where \(a>b\), whereas an expression for (2) could be:
\begin{equation}s=\begin{cases}a & r\in[r_\text{a}, r_\text{b}] \\ r & \text{otherwise}\end{cases}\end{equation}
[1].
In my case, I applied all of these to the image given in Figure 1.
Figure 1: Image to undergo various point transformations
Applying \eqref{nega} on Figure 1, we get Figure 2.
Figure 2: Image negative of Figure 1
Notice that all the dark regions in Figure 1 became light in Figure 2 and vice versa. This makes sense: since \(255\) corresponds to pure white and \(0\) to pure black, \(L-r=255-r\) gives a high gray value when \(r\) is low (dark) and a low gray value when \(r\) is high (light).
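As a minimal sketch of \eqref{nega} (assuming the image is already loaded as a NumPy array of gray values, which is not necessarily how it was done in the original code), the negative can be computed pixel-wise:

```python
import numpy as np

# toy grayscale "image" with values in [0, 255]
f = np.array([[0, 64], [128, 255]], dtype=int)

L = f.max()   # maximum gray value present in the image
g = L - f     # s = L - r, applied to every pixel at once

print(g)
```

Applying the same operation to \(g\) recovers \(f\), since the negative is its own inverse.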
To obtain the logarithmic transform of Figure 1, I chose \(c\) in \eqref{log} to scale the gray values \(\log(1+r)\) so that \(s\in[0,255]\). Searching the net, I found that \(c\) should equal \(c=255/\log(1+r_\text{max})\) [3]. With this in mind, the logarithmic transform gives Figure 3.
Figure 3: Logarithmic transform of Figure 1
It is interesting to note that Figure 3 appears to be a saturated, i.e. brighter, version of Figure 1, which made me curious as to why. Plotting \eqref{log}, we get Figure 4.
Figure 4: Plot of \eqref{log}
From it, we see that Figure 3 does make sense, since all initial gray values \(r\) are mapped to higher gray values \(s\).
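A short sketch of \eqref{log} with the scaling \(c=255/\log(1+r_\text{max})\), on a toy array:

```python
import numpy as np

f = np.array([[0, 10], [100, 200]], dtype=float)

r_max = f.max()
c = 255 / np.log(1 + r_max)   # chosen so that s = 255 when r = r_max
s = c * np.log(1 + f)         # eq. (log), applied pixel-wise

print(np.round(s).astype(int))
```

Every gray value is mapped to an equal or higher one, consistent with Figure 3 looking brighter.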
Similarly, I modified the gamma transform \eqref{gamma}: replacing \(r\) with \(r/r_\text{max}\) ensures \((r/r_\text{max})^\gamma\in[0,1]\) for all \(r\) and \(\gamma\), so to bring the result back to \([0,255]\) we set \(c=255\), and \eqref{gamma} becomes
\begin{equation}s=255\left(\dfrac{r}{r_\text{max}}\right)^\gamma \label{modgamma}\end{equation}
Applying this to Figure 1 for differing values of \(\gamma\), we get Figure 5.
Figure 5: Gamma transform of Figure 1 for differing \(\gamma\)
Notice that for \(\gamma<1\) the resulting image is lighter, while for \(\gamma>1\) it is darker. This makes more sense if we plot \eqref{modgamma} for the values of \(\gamma\) used in Figure 5. Doing so, we get Figure 6.
Figure 6: Plot of \eqref{modgamma} for the \(\gamma\) used in Figure 5
From it, we see why the resulting images are lighter when \(\gamma<1\) and darker when \(\gamma>1\): curves with \(\gamma<1\) map each \(r\) to a higher \(s\), while curves with \(\gamma>1\) map each \(r\) to a lower \(s\).
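A sketch of \eqref{modgamma} on a toy array (the specific \(\gamma\) values here are illustrative, not necessarily the ones used in Figure 5), confirming that \(\gamma<1\) brightens mid-tones and \(\gamma>1\) darkens them:

```python
import numpy as np

f = np.array([[0, 64], [128, 255]], dtype=float)
r_max = f.max()

results = {}
for gamma in (0.5, 2.0):
    # eq. (modgamma): s = 255 * (r / r_max) ** gamma
    results[gamma] = 255 * (f / r_max) ** gamma

print({g: np.round(s).astype(int) for g, s in results.items()})
```

Note that the endpoints \(r=0\) and \(r=r_\text{max}\) are fixed points of the transform for any \(\gamma\); only the values in between move.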
To implement contrast stretching on Figure 1, I used \eqref{stretch} with \(r_\text{a}\) and \(r_\text{b}\) chosen by hand. It didn't make much sense to do this on my image, since it appeared to contain all possible gray levels, but I did it anyway in hopes of getting interesting results.
So, I experimented with values of \(r_\text{a}\) and \(r_\text{b}\) that are multiples of \(255/5=51\), i.e. \(51\), \(102\), \(153\), and \(204\). Using the pairs \((51,204)\) and \((102,153)\), I got Figure 7:
Figure 7: Images with stretched contrast for different \(r_\text{a}\) and \(r_\text{b}\).
Notice that, as expected, there is visually not much difference between Figure 7 and Figure 1. However, if we inspect the probability distribution functions (PDFs) of these images, as in Figure 8,
Figure 8: The original image, the contrast-stretched images, and their corresponding PDFs.
we see that the range of gray values present in the image has changed. Plotting \eqref{stretch} with the corresponding pairs, we get Figure 9:
Figure 9: Plot of \eqref{stretch} using the pairs from Figures 7 and 8.
Here we see that, for the pairs used, the range of values \(s\) takes up actually becomes narrower, so in our case the contrast was compressed rather than stretched: \eqref{stretch} maps the full input range onto \([r_\text{a},r_\text{b}]\), which for these pairs is narrower than \([r_\text{min},r_\text{max}]\).
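This behavior is easy to check with a sketch of \eqref{stretch} on a toy array (with the clipping described in [2]): the pair \((51,204)\) squeezes the output, while \((0,255)\) would genuinely stretch.

```python
import numpy as np

f = np.array([[0, 60], [180, 255]], dtype=float)
r_min, r_max = f.min(), f.max()

def stretch(r, ra, rb):
    # eq. (stretch): linear map of [r_min, r_max] onto [ra, rb], then clip
    s = (r - r_min) * (rb - ra) / (r_max - r_min) + ra
    return np.clip(s, 0, 255)

s1 = stretch(f, 51, 204)   # output squeezed into [51, 204]
s2 = stretch(f, 0, 255)    # output spans the full gray range

print(s1)
print(s2)
```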
We now move to contrast thresholding. Here, I simply applied \eqref{thresh} for different values of \(t\). Doing so, I got Figure 10.
Figure 10: Thresholded Figure 1 for different thresholds \(t\)
It is interesting to note that for \(t=51\), the boundary between the light and dark regions became very visible, and as \(t\) increased, more and more such boundaries appeared, to the point that at \(t=204\), the only region left light is the one most of the light is coming from. This suggests that contrast thresholding may be used to find edges between light and dark regions, and to locate where light is coming from.
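A sketch of \eqref{thresh} on a toy array, showing how the set of white pixels shrinks as \(t\) grows:

```python
import numpy as np

f = np.array([[10, 60], [150, 220]], dtype=int)

binarized = {}
for t in (51, 102, 153, 204):
    binarized[t] = np.where(f >= t, 255, 0)   # eq. (thresh)

# the count of white pixels is non-increasing as t increases
print([int((b == 255).sum()) for b in binarized.values()])
```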
To implement intensity-level slicing, I used \eqref{intensity} with \(a=150\) and \(b=25\) for different pairs of \(r_\text{a}\) and \(r_\text{b}\), the same pairs I used in contrast stretching. Doing so, I got Figure 11:
Figure 11: Intensity-level sliced Figure 1 for different \((r_\text{a}, r_\text{b})\) pairs.
Looking at Figure 11, we see that when \((r_\text{a},r_\text{b})=(51,204)\), the really dark and really light parts of the image stand out, whereas when \((r_\text{a},r_\text{b})=(102,153)\), those parts, along with some regions that were white earlier, are now black.
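Both slicing variants can be sketched directly from \eqref{intensity}, using the values \(a=150\) and \(b=25\) from above:

```python
import numpy as np

f = np.array([[10, 60], [150, 220]], dtype=int)
a, b = 150, 25

def slice_diminish(r, ra, rb):
    # variant (1): gray values in [ra, rb] become a, everything else b
    return np.where((r >= ra) & (r <= rb), a, b)

def slice_retain(r, ra, rb):
    # variant (2): gray values in [ra, rb] become a, everything else kept
    return np.where((r >= ra) & (r <= rb), a, r)

print(slice_diminish(f, 51, 204))
print(slice_retain(f, 51, 204))
```

With the narrower pair \((102, 153)\), more pixels fall outside the slice and are pushed to \(b=25\), which is consistent with more of the image turning black.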
We were then told to do histogram equalization, a process in which you map the cumulative distribution function (CDF) of your image's gray values onto another CDF of your choosing. Essentially, for each gray value, you find the gray value at which the two CDFs are equal, or closest. To do this, I used the code shown below:
im_matrix2 = []
for y in range(im_matrix.shape[0]):
    im_matrix2.append([])
    for x in range(im_matrix.shape[1]):
        # index of the desired-CDF entry closest to this pixel's CDF value
        idx = np.argmin(np.abs(cdf_desired - cdf[im_matrix[y, x]]))
        im_matrix2[y].append(idx)
im_matrix2 = np.array(im_matrix2, dtype=int)
Here I just did what I said in code form; since both CDFs have length \(256\), with indices ranging from \(0\) to \(255\), the index at which the two CDFs are closest is already the corresponding gray value. Nice!
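Since the mapping depends only on a pixel's gray value, not its position, the same matching can be written as a lookup table over the gray levels, which avoids the per-pixel loop. Here is a sketch on a tiny 4-level example (the names `cdf` and `cdf_desired` play the same roles as above):

```python
import numpy as np

# tiny 2x2 "image" with 4 gray levels, matched to a linear (uniform) CDF
im_matrix = np.array([[0, 1], [1, 3]])
levels = 4

pdf = np.bincount(im_matrix.ravel(), minlength=levels) / im_matrix.size
cdf = np.cumsum(pdf)                      # CDF of the image's gray values
cdf_desired = np.linspace(0, 1, levels)   # target CDF

# for each gray level r, the index where cdf_desired is closest to cdf[r]
lut = np.argmin(np.abs(cdf_desired[None, :] - cdf[:, None]), axis=1)
im_matrix2 = lut[im_matrix]               # one vectorized lookup per pixel

print(lut)
print(im_matrix2)
```

For a real 8-bit image, `levels` would be \(256\) and the lookup table has one entry per gray value, so the cost of the matching step is independent of the image size.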
Now, I did this for both linear and quadratic desired CDFs. To compare the input with the output, I show the images together with their PDFs and CDFs in Figure 12.
Figure 12: The input and resulting images after histogram equalization using different CDFs, together with their PDFs and CDFs
Notice that the input image appears dark; as shown in its PDF, most gray values are low, with two distinct peaks, and its CDF changes rapidly in some regions while being almost flat in others. After applying histogram equalization with a linear desired CDF, we get Figure 12d. Notice that it now appears brighter, with the pots visible, and that it looks smooth and aesthetically pleasing (at least in my opinion). Looking at its PDF, we see that the peaks moved to higher gray values and that more pixels have gray values from 200 to 255, which makes sense because the image is visually brighter. Note also that the PDF has gaps where gray values are absent: the discrete mapping must spread neighboring input gray values apart to follow the desired CDF. And as expected, the CDF of this image is linear.
If we instead do histogram equalization on the same image with a quadratic desired CDF, the resulting image is much brighter: the peaks in the PDF move to even higher gray values, with more spread and more gray values absent from the image, which again follows from the shape of the desired CDF, and the image's CDF after equalization is indeed quadratic. Aesthetically, however, the image looks odd, with the light intensity appearing saturated rather than normalized. In my opinion, this suggests that since our eyesight is also nonlinear, we can distinguish dark objects from really dark objects to some extent, but cannot distinguish light objects from very light objects.
References:
[1] R. C. Gonzalez and R. E. Woods, Digital Image Processing, 2nd ed., Ch. 3 (Prentice-Hall, New Jersey, 2002).
[2] https://homepages.inf.ed.ac.uk/rbf/HIPR2/stretch.htm
[3] https://homepages.inf.ed.ac.uk/rbf/HIPR2/pixlog.htm