Padding an Image
November 24, 2020 2020-12-03 21:17
In some cases, a convolution/pooling operation cannot be performed on an image because the image is smaller than the filter region. To fix this problem, we can pad the image with extra rows and columns of pixel values around its border, forming a larger image tensor. There are different ways to choose the values of the padded pixels, but the most common choices are 0 or the value of the closest pixel.
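The two padding choices mentioned above can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function name `pad_image`, the mode names `"zero"` and `"edge"`, and the pixel values 1 to 4 are assumptions made for this example, not part of any particular library.

```python
def pad_image(image, pad, mode="zero"):
    """Pad a 2-D image (a list of rows) with `pad` pixels on every side.

    mode="zero" fills the border with 0.
    mode="edge" fills the border with the value of the closest pixel.
    """
    if mode not in ("zero", "edge"):
        raise ValueError("mode must be 'zero' or 'edge'")
    height, width = len(image), len(image[0])
    padded = []
    for r in range(-pad, height + pad):
        row = []
        for c in range(-pad, width + pad):
            if 0 <= r < height and 0 <= c < width:
                # Inside the original image: copy the pixel unchanged.
                row.append(image[r][c])
            elif mode == "zero":
                row.append(0)
            else:
                # Clamp the coordinates to the nearest valid pixel.
                rr = min(max(r, 0), height - 1)
                cc = min(max(c, 0), width - 1)
                row.append(image[rr][cc])
        padded.append(row)
    return padded


# Hypothetical 2x2 image used only for illustration.
image = [[1, 2],
         [3, 4]]
zero_padded = pad_image(image, 1, mode="zero")
edge_padded = pad_image(image, 1, mode="edge")

for row in zero_padded:
    print(row)
for row in edge_padded:
    print(row)
```

In practice, array libraries usually provide this directly; NumPy's `np.pad`, for instance, offers comparable behaviour through its `constant` and `edge` modes.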
Here is a simple example demonstrating the concept of padding an image with zeroes. Consider an image tensor with a dimension of 2×2 as shown on the left side of the image below. Since this image has only two rows and two columns of pixel values, we cannot use a 3×3 filter on it. So, we can pad the image with zeroes to make a 3×3 convolution/pooling operation possible. Padding can be done as shown on the right side of the image below.
Now, we have a 4×4 image tensor and the 3×3 convolution/pooling operation can be performed. Here, the value of padding is 1 since we added one row or column of pixels on each of the top, left, right, and bottom sides.
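To make this concrete, the following sketch applies a 3×3 filter to a zero-padded 4×4 tensor and shows that the same filter does not fit on the unpadded 2×2 image. The pixel values, the filter weights, and the helper name `convolve2d_valid` are all hypothetical. Like most deep learning libraries, the helper slides the filter without flipping it.

```python
def convolve2d_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no additional padding."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    if ih < kh or iw < kw:
        raise ValueError("image is smaller than the filter")
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            # Weighted sum of the window whose top-left corner is (r, c).
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        out.append(row)
    return out


# Hypothetical 2x2 image and its zero-padded 4x4 version (padding of 1).
original = [[1, 2],
            [3, 4]]
padded = [[0, 0, 0, 0],
          [0, 1, 2, 0],
          [0, 3, 4, 0],
          [0, 0, 0, 0]]

# Hypothetical filter: each output is a pixel plus its right-hand neighbour.
kernel = [[0, 0, 0],
          [0, 1, 1],
          [0, 0, 0]]

result = convolve2d_valid(padded, kernel)
print(result)

try:
    convolve2d_valid(original, kernel)
    fits_without_padding = True
except ValueError:
    fits_without_padding = False
print(fits_without_padding)
```

The output has the same 2×2 shape as the original image, and the pixels in the right-hand column pick up a zero from the padded border instead of a real neighbour.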
Finding the size of an output tensor when padding is used
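If $n_h \times n_w$ is the size of the input image tensor, $f_h \times f_w$ is the size of the convolution filter, $s$ is the value of the stride and $p$ is the amount of padding, then the size of the resulting tensor, $o_h \times o_w$ (after a convolution operation), can be found using the following formula:

$$o_h = \left\lfloor \frac{n_h - f_h + 2p}{s} \right\rfloor + 1, \qquad o_w = \left\lfloor \frac{n_w - f_w + 2p}{s} \right\rfloor + 1$$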
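As an example, consider calculating the output tensor size when a 3×3 filter, a stride of 1 and a padding of 1 are used on a 4×2 image:

$$\text{height} = \left\lfloor \frac{4 - 3 + 2 \cdot 1}{1} \right\rfloor + 1 = 4, \qquad \text{width} = \left\lfloor \frac{2 - 3 + 2 \cdot 1}{1} \right\rfloor + 1 = 2$$

Thus, the size of the output tensor is 4×2.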
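The same calculation can be written as a small helper, shown here as a sketch. The function name `conv_output_size` and the parameter names `n` (input size along one dimension), `f` (filter size), `s` (stride) and `p` (padding) are labels chosen for this example. It computes floor((n - f + 2p) / s) + 1 for one dimension at a time.

```python
def conv_output_size(n, f, s=1, p=0):
    """Output size along one dimension: floor((n - f + 2p) / s) + 1."""
    if s < 1:
        raise ValueError("stride must be at least 1")
    if n + 2 * p < f:
        raise ValueError("filter is larger than the padded input")
    return (n - f + 2 * p) // s + 1


# The worked example: 3x3 filter, stride 1, padding 1, on a 4x2 image.
height = conv_output_size(4, f=3, s=1, p=1)
width = conv_output_size(2, f=3, s=1, p=1)
print(height, width)

# The earlier 2x2 image: with padding 1 the 3x3 filter fits and gives 2x2.
small = conv_output_size(2, f=3, s=1, p=1)
print(small)
```

Applying the helper once per dimension keeps it usable for non-square images and, with different `f` values, for non-square filters.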
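Note: Performing a convolution/pooling operation usually decreases the size of the input tensor, but if the right padding value is chosen, the original size can be retained. For example, with a stride of 1 and a 3×3 filter, a padding of 1 keeps the output the same size as the input.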
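For a stride of 1 and an odd filter size f, the padding that retains the input size is (f - 1) / 2. The sketch below checks this for a few filter sizes. The helper name `same_padding` and the input size of 28 are assumptions chosen purely for illustration.

```python
def same_padding(f):
    """Padding that keeps the size unchanged for stride 1 and an odd filter size."""
    if f < 1 or f % 2 == 0:
        raise ValueError("this sketch only handles odd, positive filter sizes")
    return (f - 1) // 2


n = 28  # hypothetical input size along one dimension
output_sizes = {}
for f in (1, 3, 5, 7):
    p = same_padding(f)
    # Output size for stride 1: (n - f + 2p) + 1
    output_sizes[f] = (n - f + 2 * p) + 1

print(output_sizes)
```

Even filter sizes are rejected here because a symmetric padding cannot keep the size unchanged for them; handling that case would need different amounts of padding on opposite sides.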
With this, you now have all the fundamental knowledge required to build a Convolutional Neural Network. In the next chapter, we will tie together everything we have learned so far to build one.