First Taste of Image Processing

On August 15, 2018, we had our first taste of image processing. We didn't do anything hardcore, but it was still fun since all of this was new to us and the possibilities felt endless.

The first thing we were told to do was to create 30x30 images, either by hand or by programming them. Neither was really straightforward since we were making .pgm images, a not-so-familiar image file type that can't be edited easily using Paint. A .pgm, or portable gray map image [1], is a very simple image file type that encodes an image as gray values on a scale from 0 to a stated maximum (255 in our case), where 0 represents black and the maximum represents white. Essentially, when encoding or editing a plain .pgm file, all you need to do is edit the gray value of each pixel, and you get your result after saving it. How easy!

In our case, we were tasked to create seven images: a dot, the outline of a square, a cross, a filled circle, a smiley, concentric squares of different gray values, and the first letter of my name, which is 'C'.

Originally, I was thinking of just making a .png image, converting it to gray scale, and putting that into the .pgm file, since I already had code that takes an image and tells me the gray value of each pixel. However, I thought that would be boring and I wouldn't learn anything, so I decided to hard code everything. It wasn't all that hard either.

All I had to do was create an empty 30x30 array and modify the values of some points in that array to outline the shape I needed. The dot and the square were pretty simple: I just had to pick a point on the array and build a square from it with a certain side length. It became a bit more challenging after that, though.
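
As a rough sketch of the idea (not necessarily the exact code behind Figure 1), the dot and the square outline could be built with NumPy like this. The function names, the default positions, and the value 255 for "on" pixels are all assumptions made for illustration.

```python
import numpy as np

SIZE = 30  # image side length used in the activity


def make_dot(row=15, col=15, value=255):
    """Return a SIZE x SIZE black image with a single bright pixel."""
    img = np.zeros((SIZE, SIZE), dtype=int)
    img[row, col] = value
    return img


def make_square_outline(top=5, left=5, side=20, value=255):
    """Return a black image with a one-pixel-wide square outline."""
    img = np.zeros((SIZE, SIZE), dtype=int)
    bottom, right = top + side - 1, left + side - 1
    img[top, left:right + 1] = value      # top edge
    img[bottom, left:right + 1] = value   # bottom edge
    img[top:bottom + 1, left] = value     # left edge
    img[top:bottom + 1, right] = value    # right edge
    return img


dot = make_dot()
square = make_square_outline()
```

Starting from the top-left corner and a side length mirrors the "pick a point and build a square from it" description above.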

But after a little thinking, I realized that to create a cross, all I had to do was the same thing I did with the square, but only modify the values of the points along the middle of the square, top to bottom and left to right. Sure enough, I was able to make it! The circle felt challenging at the start, but I remembered one important equation - the equation of the circle, given by
\begin{equation}\label{circle} (x-x_0)^2+(y-y_0)^2=r^2\end{equation}
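Hence all I had to do was choose a center \((x_0,y_0)\) and a radius \(r\), and then modify the values of the points in the array that satisfied the condition in \eqref{circle}. For the filled circle, that means also including the points inside the circle, i.e. those where the left-hand side is at most \(r^2\).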
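
Here is a minimal sketch of the cross and the filled circle under a few assumptions: the cross spans the whole image rather than a smaller square, the bar thickness and radius are arbitrary, and the filled circle uses the inequality form of the circle equation (left-hand side at most \(r^2\)) so that interior pixels are included. The function and parameter names are made up for this example.

```python
import numpy as np


def make_cross(size=30, thickness=2, value=255):
    """Black image with a horizontal and a vertical bar through the middle."""
    img = np.zeros((size, size), dtype=int)
    start = (size - thickness) // 2
    img[start:start + thickness, :] = value   # left-to-right bar
    img[:, start:start + thickness] = value   # top-to-bottom bar
    return img


def make_filled_circle(size=30, x0=15, y0=15, r=10, value=255):
    """Black image with a filled disc centered at (x0, y0)."""
    y, x = np.indices((size, size))           # row index is y, column index is x
    inside = (x - x0) ** 2 + (y - y0) ** 2 <= r ** 2
    img = np.zeros((size, size), dtype=int)
    img[inside] = value
    return img


cross = make_cross()
circle = make_filled_circle()
```

On a pixel grid very few points satisfy the circle equation exactly, which is why an inequality (or a small tolerance around \(r^2\) for an outline) is the practical condition to test.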

I was beginning to feel lazy at this point because I had been thinking hard about how to get the shapes we needed, so to make the smiley, I just made a circle and drew three lines for the eyes and mouth, depicting how I felt at that time. Haha!

Making the concentric squares and the letter 'C' was the most fun in my opinion, as I was always anticipating what they would look like in the end. To make the concentric squares, all I had to do was make a series of squares, each with a gray value different from the rest. I thought it would be nice if the gray values spanned from 0 to 255, so in the end I had a series of squares going from white to black. Making the letter 'C' was also fun because all I had to do was make concentric circles and change the condition so that, besides lying on a circle, only the points with \(x\) smaller than a certain \(x_0\) value were modified. This made it look cool.
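
The two shapes above can be sketched as follows. This is only one way to do it: the number of squares, the direction of the gradient (white outside, black inside), the ring radii, and the cutoff column `x_cut` are assumptions chosen to give a reasonable picture, not values taken from Figure 1.

```python
import numpy as np


def make_concentric_squares(size=30, n_squares=5, step=3):
    """Nested filled squares whose gray values run from 255 (outer) to 0 (inner)."""
    img = np.zeros((size, size), dtype=int)
    values = np.linspace(255, 0, n_squares).astype(int)
    for k, value in enumerate(values):
        lo, hi = k * step, size - k * step
        img[lo:hi, lo:hi] = value   # each smaller square overwrites the center
    return img


def make_letter_c(size=30, x0=15, y0=15, r_outer=12, r_inner=8, x_cut=20, value=255):
    """A ring (between two concentric circles) kept only where x < x_cut."""
    y, x = np.indices((size, size))
    d2 = (x - x0) ** 2 + (y - y0) ** 2
    ring = (d2 >= r_inner ** 2) & (d2 <= r_outer ** 2)
    img = np.zeros((size, size), dtype=int)
    img[ring & (x < x_cut)] = value   # cutting off the right side opens the 'C'
    return img


squares = make_concentric_squares()
letter_c = make_letter_c()
```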

All the resulting .pgm images are shown in Figure 1, although I had to convert them to .png so that I could place them here in the blog.

Figure 1: The synthetic images I created by programming them in Python

To convert the 30x30 matrix I made to a .pgm file, though, I needed a program to do it for me. Luckily, Python had what I needed to do just that. Knowing what is contained in a typical plain .pgm file, all I had to do was write out the necessary information: the magic number P2, a comment line with the .pgm's filename, the dimensions of the matrix as width then height (which are also the image's dimensions), the maximum value found in the matrix, and the matrix itself without the brackets. The resulting program is shown below.
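import numpy as np

def matrix_to_pgm(matrix, filename):
    # Save a 2D array of gray values as a plain (P2) file named '<filename>.pgm'
    rows, cols = matrix.shape
    with open(str(filename)+'.pgm', 'w') as file:
        # Header: magic number, comment, width and height, maximum gray value
        file.write('P2\n# '+str(filename)+'.pgm\n'+str(cols)+' '+str(rows)+'\n'+str(int(np.amax(matrix)))+'\n')
        for ax in range(rows):
            # One image row per line, values separated by spaces
            for ay in range(cols):
                file.write(str(int(matrix[ax, ay]))+' ')
            if ax < rows-1: file.write('\n')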
We were then tasked to take any image, scale it down so that it wouldn't be very big, convert it to gray scale, and save it as a .pgm file. Fortunately, all I needed to do was reuse the code I made to convert 2D matrices to .pgm files, and I got the result shown in Figure 2. It was pretty straightforward.

Figure 2: The process of converting the input image to a .pgm file.
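
The post does not say how the scaling and gray scale conversion were done, so the following is only one possible sketch. It assumes the image is already loaded as a NumPy array of shape (height, width, 3), shrinks it by keeping every `step`-th pixel, and converts to gray using the common luma weights 0.299, 0.587, and 0.114. The function names and the synthetic test image are hypothetical.

```python
import numpy as np


def shrink(rgb, step=4):
    """Crude downscaling: keep every step-th pixel along both axes."""
    return rgb[::step, ::step]


def to_gray(rgb):
    """Weighted sum of the R, G, B channels, rounded to integer gray values."""
    weights = np.array([0.299, 0.587, 0.114])
    gray = rgb[..., :3] @ weights
    return np.rint(gray).astype(int)


# Synthetic stand-in for a loaded photo: 120 x 80 pixels, all pure red.
photo = np.zeros((120, 80, 3), dtype=np.uint8)
photo[..., 0] = 255

small = shrink(photo)
gray = to_gray(small)
```

The resulting 2D array `gray` is in the form that a matrix-to-.pgm writer expects.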
Since there was code to convert 2D matrices to .pgm files, it only seemed fitting to write code that does the reverse: convert a .pgm file into a matrix. It isn't as straightforward, since the matrix in the .pgm file doesn't contain brackets, so a direct conversion from the file's text to a matrix in Python wouldn't work. To work around this, I taught the program to build up numerical strings before placing them into a matrix. It checks whether the character it is reading in the .pgm file is a digit and appends it to the current string. When it encounters a space or reaches the end of the line while the current string contains digits, it places that value in the matrix, clears the string, and keeps reading characters until the same condition is met again.

I had a lot of problems doing this at first because I didn't know how to read text from files using Python, so I had to do a little research, and gladly I was able to do it. The resulting code is shown below.
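import numpy as np

def pgm_to_matrix(filename):
    # Read a plain (P2) .pgm file written by matrix_to_pgm into a 2D array.
    # Assumes exactly four header lines (magic number, comment, dimensions,
    # maximum value) followed by one image row per line.
    with open(str(filename)+'.pgm', 'r') as file:
        line = [l.strip() for l in file]
    matrix = []
    for l in range(4, len(line)):
        row = []
        s = ""
        for c in range(len(line[l])):
            if line[l][c].isdigit():
                s += line[l][c]
            # A space or the end of the line closes the number being built
            if s != "" and (line[l][c] == ' ' or c == len(line[l])-1):
                row.append(int(s))
                s = ""
        if row: matrix.append(row)  # skip blank lines
    return np.asarray(matrix)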

And to prove that it works, Figure 3 shows an example of it in action. It was satisfying once I got the program to work, since getting there was grueling at first.

Figure 3: Process of converting a .pgm file to a 2D matrix.
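
For comparison, here is a sketch of an alternative reader that leans on `str.split` instead of scanning character by character. It is not the approach described above. It takes the file's text as a string, drops anything after a `#` on each line, and uses the width and height from the header to shape the matrix, so it does not depend on the header having exactly four lines. The function name `parse_plain_pgm` is made up for this example.

```python
import numpy as np


def parse_plain_pgm(text):
    """Parse the text of a plain (P2) .pgm file into (matrix, maxval)."""
    tokens = []
    for line in text.splitlines():
        line = line.split('#', 1)[0]   # drop comments
        tokens.extend(line.split())    # split on any whitespace
    if not tokens or tokens[0] != 'P2':
        raise ValueError('not a plain (P2) .pgm file')
    width, height, maxval = (int(t) for t in tokens[1:4])
    pixels = [int(t) for t in tokens[4:4 + width * height]]
    return np.array(pixels).reshape(height, width), maxval


sample = "P2\n# tiny example\n3 2\n255\n0 128 255\n255 128 0\n"
matrix, maxval = parse_plain_pgm(sample)
```

The trade-off is that this version trusts the header for the image shape, while the character-scanning version trusts the line breaks.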

Next, we were told to manipulate the gray values of an image by either adding a constant to them or multiplying them by a constant. In Figure 4, I show the original image (a) and the results of adding a constant (b) and multiplying by a constant (c). Interestingly, all it did was make the image much whiter. This makes sense: adding a positive constant, or multiplying by a constant greater than 1, only makes a gray value larger, and the closer a value gets to 255 (the maximum), the closer that pixel is to pure white.
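
A small sketch of both operations, assuming that results above 255 are clipped to 255 (the post does not say how overflow was handled). The function names and the sample values are illustrative only.

```python
import numpy as np


def add_constant(img, c, maxval=255):
    """Add c to every gray value, clipping the result to [0, maxval]."""
    return np.clip(img.astype(int) + c, 0, maxval)


def multiply_constant(img, k, maxval=255):
    """Multiply every gray value by k, rounding and clipping to [0, maxval]."""
    scaled = np.rint(img.astype(float) * k)
    return np.clip(scaled, 0, maxval).astype(int)


img = np.array([[0, 100, 200]])
added = add_constant(img, 100)        # every pixel gets brighter
scaled = multiply_constant(img, 2)    # pure black stays black
```

One difference worth noticing: addition lifts every pixel, including black ones, while multiplication leaves a gray value of 0 unchanged.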

We were then told to add and subtract two of the images we created in Figure 1. I decided to use the cross and the concentric squares since, more or less, they depict everything that could happen when you add or subtract images. I noticed two things. First, the resulting images were essentially the superposition of the two images, so the parts where both images had nonzero values changed after adding or subtracting them. Second, when we added the two images, as in Figure 4d, their intersection became white, while subtracting them made their intersection black, as in Figure 4e, as we would expect.

Figure 4: Result of adding and multiplying a constant to the gray scale values of the input image. Also, the result of the addition and subtraction of two different images, in this case, a cross and a picture of concentric squares with varying gray scale values.
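
Adding and subtracting two images can be sketched the same way, again assuming that sums above 255 are clipped to white and negative differences are clipped to black. The names `add_images` and `subtract_images` and the sample arrays are made up for illustration.

```python
import numpy as np


def add_images(a, b, maxval=255):
    """Pixel-wise sum of two same-shaped images, clipped to [0, maxval]."""
    return np.clip(a.astype(int) + b.astype(int), 0, maxval)


def subtract_images(a, b, maxval=255):
    """Pixel-wise difference a - b, clipped to [0, maxval]."""
    return np.clip(a.astype(int) - b.astype(int), 0, maxval)


a = np.array([[255, 0, 100]])
b = np.array([[200, 50, 100]])
total = add_images(a, b)
diff = subtract_images(a, b)
```

Casting to `int` first matters if the arrays are stored as `uint8`, because 8-bit arithmetic in NumPy wraps around instead of saturating.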

Lastly, we were told to create a histogram of the gray values found in our image. Doing this for my image in Figure 5a gives Figure 5b. Notice that the histogram doesn't show any general trend; it simply shows the number of pixels having each gray value, which is what you'd expect from a histogram. One nice thing to note is that while the image appears to have a lot of black spots, the actual peak in the histogram is at a gray value of 255, corresponding to white, while the counts for the dark pixels are spread over low gray values ranging from 0 to 50. This suggests that most of what appears black in the input image is not pure black (gray value 0) but a range of dark grays, so no single dark value has as many pixels as pure white. And it sort of makes sense, since looking at the lower left-hand corner of the input image in Figure 5a, we see a big patch of pure white.

Figure 5: The input image and the corresponding histogram of gray values found in the image.
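
Counting pixels per gray value is short with NumPy. The sketch below uses `np.bincount`, which is just one convenient option and not necessarily how Figure 5b was produced. The function name and the tiny sample image are illustrative.

```python
import numpy as np


def gray_histogram(img, maxval=255):
    """Return an array h where h[v] is the number of pixels with gray value v."""
    values = np.asarray(img, dtype=int).ravel()
    return np.bincount(values, minlength=maxval + 1)


img = np.array([[0, 0, 255],
                [10, 255, 255]])
hist = gray_histogram(img)
peak = int(np.argmax(hist))   # gray value with the most pixels
```

Plotting `hist` against the gray values 0 to 255 as a bar chart gives a histogram like the one in the figure.
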
That was a pretty long ramble about what was going on inside my head while I was doing all these activities, and I don't know if I can sustain this throughout the rest of them. The succeeding posts might become shorter, but I'll try my best to share as much insight as I can into how these ideas came up in my head while doing them.

References:
[1] http://netpbm.sourceforge.net/doc/pgm.html
