Stride and Calculation of Output Size

In the last lesson, we shifted the convolution filter (kernel) one pixel at a time, i.e., with a stride of 1. However, the filter does not have to move across an image in steps of exactly 1.

Stride is the number of pixels by which the filter shifts over the input matrix at each step. The stride ($s$) used during a series of convolution operations can be changed to suit the problem. When $s = 1$, the filter is shifted one column of pixel values to the right or one row of pixel values down. Similarly, when $s = 2$, the filter is shifted two columns of pixel values to the right or two rows of pixel values down, and so on.
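To make the shifting concrete, here is a minimal sketch that lists where the left edge of the filter lands along one row for a given stride. The function name `window_starts` and the sizes (a 7-pixel row, a 3-pixel-wide filter) are illustrative assumptions, not values from the lesson.

```python
def window_starts(n_input, n_kernel, stride):
    """Return the start indices at which a filter of width n_kernel
    fits fully inside an axis of length n_input."""
    # The last valid start index is n_input - n_kernel.
    return list(range(0, n_input - n_kernel + 1, stride))


# Illustrative sizes: a row of 7 pixels and a filter 3 pixels wide.
print(window_starts(7, 3, 1))  # [0, 1, 2, 3, 4]
print(window_starts(7, 3, 2))  # [0, 2, 4]
```

With a stride of 2 the filter visits every second position, so it lands on three positions instead of five.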

But why would anyone want to take a larger stride, given that the filter then skips some positions during computation? There are multiple reasons, but here are the major ones:

Taking a larger stride allows a series of convolution operations to be computed faster for a large image (say, 3000x3000 pixels), because the filter is applied at fewer positions.
Less memory is needed to store the results of the convolution operation.
The size of the output tensor can be reduced to make the input to the next layer of a Convolutional Neural Network smaller.
Since neighbouring regions overlap less, the output carries less redundant information, which may help reduce overfitting.
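The savings in computation and memory can be estimated by counting how many positions the filter visits. The sketch below does this for the 3000x3000 image mentioned above; the 3x3 filter size and the helper name `count_outputs` are assumptions made for illustration.

```python
def count_outputs(n_rows, n_cols, n_kernel, stride):
    """Count the filter positions (and hence output values) for one
    square filter sliding over an n_rows x n_cols image."""
    rows = len(range(0, n_rows - n_kernel + 1, stride))
    cols = len(range(0, n_cols - n_kernel + 1, stride))
    return rows * cols


# Assumed setup: 3000x3000 image, 3x3 filter, strides 1 to 3.
for s in (1, 2, 3):
    print(s, count_outputs(3000, 3000, 3, s))
# 1 8988004
# 2 2247001
# 3 1000000
```

Under these assumptions, moving from a stride of 1 to a stride of 2 cuts the number of output values to roughly a quarter.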

Finding the size of an output tensor after a series of convolution operations

Generally, in a Convolutional Neural Network, the input image undergoes multiple convolution operations, and each convolution operation might change the size of the tensor passed to it. In this section, you will learn an easy way to find the size of an output tensor after a series of convolution operations.

If $n_{A1}$ x $n_{A2}$ is the size of the input image tensor, $n_K$ x $n_K$ is the size of the convolution filter, and $s$ is the stride, then the size of the resulting tensor, $n_{O1}$ x $n_{O2}$ (after a series of convolution operations with no padding applied), can be found using the following formula:

$$n_{O1} = \text{floor}\begin{pmatrix} \dfrac{n_{A1}-n_{K}}{s} + 1 \end{pmatrix}$$

and,

$$n_{O2} = \text{floor}\begin{pmatrix} \dfrac{n_{A2}-n_{K}}{s} + 1 \end{pmatrix}$$

where $\text{floor()}$ means that a fractional result is rounded down to the nearest integer.
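The formula translates directly into a few lines of Python. This is a minimal sketch: the function name `conv_output_size` is hypothetical, and it assumes a square filter and no padding, as in the formula above. For non-negative integers, integer division (`//`) performs the same rounding as $\text{floor()}$.

```python
def conv_output_size(n_a1, n_a2, n_k, s):
    """Return (n_o1, n_o2) for an n_a1 x n_a2 input, an n_k x n_k
    filter and stride s, assuming no padding."""
    if s < 1:
        raise ValueError("stride must be at least 1")
    if n_k > n_a1 or n_k > n_a2:
        raise ValueError("filter must not be larger than the input")
    n_o1 = (n_a1 - n_k) // s + 1
    n_o2 = (n_a2 - n_k) // s + 1
    return n_o1, n_o2


# Illustrative inputs (not taken from the lesson):
print(conv_output_size(7, 7, 3, 2))  # (3, 3)
print(conv_output_size(8, 6, 3, 2))  # (3, 2)
```

In the second call, $(8 - 3) / 2 = 2.5$ is rounded down to 2 before adding 1, which shows where the floor matters.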

Recall that in the first lesson we started with an image tensor $A$ of size 4x4 and a stride $s$ of 1, and that after performing a series of convolutions the output tensor $O$ had a size of 2x2. Let us check that the formula above gives the same result for a filter size of 3x3:

$$n_{O1} = \text{floor}\begin{pmatrix} \dfrac{4-3}{1} + 1 \end{pmatrix} = 2$$

and,

$$n_{O2} = \text{floor}\begin{pmatrix} \dfrac{4-3}{1} + 1 \end{pmatrix} = 2$$

Thus, the size of the output tensor is 2 x 2.
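The same result can be checked by actually sliding a filter over a 4x4 tensor. The sketch below is a plain-Python illustration, not the implementation used by any library; the pixel values and the all-ones filter are made up for the example, and the filter is applied without flipping, as is common in CNNs.

```python
def convolve2d(image, kernel, stride=1):
    """Slide a square kernel over a 2D list with the given stride
    (no padding) and return the resulting 2D list."""
    n_a1, n_a2 = len(image), len(image[0])
    n_k = len(kernel)
    output = []
    for i in range(0, n_a1 - n_k + 1, stride):
        row = []
        for j in range(0, n_a2 - n_k + 1, stride):
            # Element-wise product of the kernel and the current region.
            total = sum(
                image[i + u][j + v] * kernel[u][v]
                for u in range(n_k)
                for v in range(n_k)
            )
            row.append(total)
        output.append(row)
    return output


# Made-up 4x4 image and 3x3 all-ones filter.
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
kernel = [[1] * 3 for _ in range(3)]

result = convolve2d(image, kernel, stride=1)
print(len(result), len(result[0]))  # 2 2
print(result)  # [[54, 63], [90, 99]]
```

The output has two rows and two columns, matching the 2 x 2 size given by the formula.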

With this, you now know about the Convolution Operation in CNNs. In the next chapter, you will be introduced to another important operation in CNNs, the Pooling Operation.

Written by

The Click Reader

At The Click Reader, we are committed to empowering individuals with the tools and knowledge needed to excel in the ever-evolving field of data science. Our sole focus is delivering a world-class data science bootcamp that transforms beginners and upskillers into industry-ready professionals.
