Padding an Image

Table of Contents

[latexpage]

In some cases, it is not possible to perform a convolution/pooling operation on an image if the dimension of the image is smaller than the dimension of the filter region. Thus, to fix this problem, we can pad such images with rows and columns of pixel values to form an image tensor. There are different ways to choose the values of the padded pixels but we mostly use '0' or the value of the closest pixel.

Here is a simple example demonstrating the concept of padding an image with zeroes. Consider an image tensor $A$ with a dimension of 2x2 as shown on the left side of the image below. Since this image only has two columns of pixel values, we cannot use a 3x3 filter on it. So, we can pad the image with zeroes to make 3x3 convolution/pooling operation possible. Padding can be done as shown on the right side of the image below,

Now, we have a 4x4 image tensor and the 3x3 convolution/pooling operation can be performed. Here, the value of padding is 1 since we padded the image once on the top, left, right, and bottom.

Finding the size of an output tensor when padding is used

If $n_{A1}$ x $n_{A2}$ is the size of the input image tensor, $n_K$ x $n_K$ is the size of the convolution filter, $s$ is the value of stride taken and $p$ is the amount of padding, then, the size of the resulting tensor, $n_{O1}$ x $n_{O2}$ (after a series of convolution operation) can be found out using the following formula:

$$n_{O1} = \text{floor}\begin{pmatrix} \dfrac{n_{A1}+2p-n_{K}}{s} + 1 \end{pmatrix}$$

and,

$$n_{O2} = \text{floor}\begin{pmatrix} \dfrac{n_{A2}+2p-n_{K}}{s} + 1 \end{pmatrix}$$

Calculating the output tensor size when a filter size of 3x3, stride of 1 and a padding of 1 is used on a 4x2 image.

$$n_{O1} = \text{floor}\begin{pmatrix} \dfrac{4+2-3}{1} + 1 \end{pmatrix} = 4$$

and,

$$n_{O2} = \text{floor}\begin{pmatrix} \dfrac{2+2-3}{1} + 1 \end{pmatrix} = 2$$

Thus, the size of the output tensor is 4 x 2.

Note: Performing a convolution/pooling operation decreases the size of the input tensor but if the right padding value is chosen then the original size can be retained.

With this, you now have all the fundamental knowledge required to build a Convolutional Neural Network. In the next chapter, we will be tying up everything we have learned until now to build a Convolutional Neural Network.

Do you want to learn Python, Data Science, and Machine Learning while getting certified? Here are some best selling Datacamp courses that we recommend you enroll in:

Introduction to Python (Free Course) - 1,000,000+ students already enrolled!
Introduction to Data Science in Python- 400,000+ students already enrolled!
Introduction to TensorFlow for Deep Learning with Python - 90,000+ students already enrolled!
Data Science and Machine Learning Bootcamp with R - 70,000+ students already enrolled!

Written by

The Click Reader

At The Click Reader, we are committed to empowering individuals with the tools and knowledge needed to excel in the ever-evolving field of data science. Our sole focus is delivering a world-class data science bootcamp that transforms beginners and upskillers into industry-ready professionals.

Padding an Image

Finding the size of an output tensor when padding is used

Related Articles

Decision Tree Classifier

Multiple Linear Regression

Python libraries for Machine Learning

Introduction - Convolutional Neural Network Theoretical Course

Interested In Data Science Bootcamp?
Request more info now.

Padding an Image

Finding the size of an output tensor when padding is used

Related Articles

Decision Tree Classifier

Multiple Linear Regression

Python libraries for Machine Learning

Introduction - Convolutional Neural Network Theoretical Course

Interested In Data Science Bootcamp?Request more info now.

Interested In Data Science Bootcamp?
Request more info now.