In some cases, a convolution/pooling operation cannot be performed on an image because the image is smaller than the filter region. To fix this problem, we can pad such images with extra rows and columns of pixel values to form a larger image tensor. There are different ways to choose the values of the padded pixels, but the most common choices are '0' or the value of the closest pixel.
Here is a simple example demonstrating the concept of padding an image with zeros. Consider an image tensor $A$ with a dimension of 2x2 as shown on the left side of the image below. Since this image has only two rows and two columns of pixel values, we cannot use a 3x3 filter on it. So, we can pad the image with zeros to make a 3x3 convolution/pooling operation possible. Padding can be done as shown on the right side of the image below:
Now, we have a 4x4 image tensor and the 3x3 convolution/pooling operation can be performed. Here, the value of padding is 1 because we added one row or column of pixels on each side of the image: top, left, right, and bottom.
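The padding step described above can be sketched in a few lines of NumPy. This is only an illustration: the pixel values of `A` are made up, and `np.pad` is one convenient way to pad, not necessarily what a deep learning framework does internally. The sketch shows both choices mentioned earlier, zero padding and copying the closest pixel.

```python
import numpy as np

# Hypothetical 2x2 image tensor A; the pixel values are made up for illustration.
A = np.array([[1, 2],
              [3, 4]])

# Zero padding with p = 1: one extra row/column of zeros on every side.
A_zero = np.pad(A, pad_width=1, mode="constant", constant_values=0)

# Same amount of padding, but each new pixel copies the closest edge pixel.
A_edge = np.pad(A, pad_width=1, mode="edge")

print(A_zero.shape)  # shape of the zero-padded tensor
print(A_zero)
print(A_edge)
```

Both padded tensors are 4x4, so a 3x3 filter now fits on them.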
If $n_{A1}$ x $n_{A2}$ is the size of the input image tensor, $n_K$ x $n_K$ is the size of the convolution filter, $s$ is the stride, and $p$ is the amount of padding, then the size of the resulting tensor, $n_{O1}$ x $n_{O2}$ (after the convolution operation has been applied across the whole image), can be found using the following formula:
$$n_{O1} = \text{floor}\begin{pmatrix} \dfrac{n_{A1}+2p-n_{K}}{s} + 1 \end{pmatrix}$$
and,
$$n_{O2} = \text{floor}\begin{pmatrix} \dfrac{n_{A2}+2p-n_{K}}{s} + 1 \end{pmatrix}$$
As an example, let us calculate the output tensor size when a filter size of 3x3, a stride of 1, and a padding of 1 are used on a 4x2 image:
$$n_{O1} = \text{floor}\begin{pmatrix} \dfrac{4+2-3}{1} + 1 \end{pmatrix} = 4$$
and,
$$n_{O2} = \text{floor}\begin{pmatrix} \dfrac{2+2-3}{1} + 1 \end{pmatrix} = 2$$
Thus, the size of the output tensor is 4 x 2.
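The formula can also be checked with a small Python function. This is a minimal sketch: the function name `conv_output_size` and its argument names are hypothetical, chosen to mirror the symbols above, and integer floor division stands in for the floor operation.

```python
def conv_output_size(n_a1, n_a2, n_k, s=1, p=0):
    """Return (n_o1, n_o2) for an n_a1 x n_a2 input, an n_k x n_k filter,
    stride s and padding p, following the formula in the text."""
    # Floor division (//) implements the floor in the formula.
    n_o1 = (n_a1 + 2 * p - n_k) // s + 1
    n_o2 = (n_a2 + 2 * p - n_k) // s + 1
    return n_o1, n_o2


# The worked example: 4x2 image, 3x3 filter, stride 1, padding 1.
print(conv_output_size(4, 2, n_k=3, s=1, p=1))  # (4, 2)
```

The same call without padding, `conv_output_size(4, 2, n_k=3, s=1, p=0)`, gives a second dimension of 0, which reflects the fact that a 3x3 filter does not fit on an unpadded image with only two columns.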
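Note: Performing a convolution/pooling operation with a filter larger than 1x1 generally decreases the size of the input tensor, but if the right padding value is chosen, the original size can be retained, as in the example above where the 4x2 input produces a 4x2 output.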
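To see which padding value retains the size, set the output size equal to the input size in the formula with $s = 1$: $n_A + 2p - n_K + 1 = n_A$, which gives $p = (n_K - 1)/2$. The sketch below checks this for a few odd filter sizes. The helper names and the input size of 28 are assumptions made for illustration, and the result only applies to a stride of 1 with an odd filter size.

```python
def same_padding(n_k):
    """Padding that keeps the size unchanged for stride 1 and an odd filter size."""
    if n_k % 2 == 0:
        # With an even filter size, (n_k - 1) / 2 is not an integer,
        # so equal padding on both sides cannot preserve the size.
        raise ValueError("size-preserving symmetric padding needs an odd filter size")
    return (n_k - 1) // 2


def output_size(n, n_k, s, p):
    """Output size along one dimension, using the formula from the text."""
    return (n + 2 * p - n_k) // s + 1


n = 28  # hypothetical input size along one dimension
for n_k in (1, 3, 5, 7):
    p = same_padding(n_k)
    print(n_k, p, output_size(n, n_k, 1, p))
```

For every filter size in the loop, the printed output size equals the input size.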
With this, you now have all the fundamental knowledge required to build a Convolutional Neural Network. In the next chapter, we will tie together everything we have learned so far to build one.