Convolutional Neural Network Theoretical Course (Course VIII)
July 17, 2020 (2020-08-04 10:54)
Stride and Calculation of Output Size
In the last lesson, we shifted the convolution filter (kernel) by one pixel at a time, i.e., with a stride of 1. A filter does not have to move across an image with a stride of 1, however.
The stride ($s$) used during a series of convolution operations can be chosen to suit the problem. When $s = 1$, the filter is shifted one column of pixel values to the right at each step and, at the end of a row, one row of pixel values down. Similarly, when $s = 2$, the filter is shifted two columns of pixel values to the right or two rows of pixel values down, and so on.
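As a minimal sketch of this shifting, the snippet below lists the column (or row) indices at which the leading edge of the filter lands for a given stride. The function name `window_starts` and the 6-pixel-wide example are illustrative assumptions, not part of the course material.

```python
def window_starts(image_size, filter_size, stride):
    """Return the indices where the filter's leading edge is placed.

    Hypothetical helper: assumes no padding and that the filter must fit
    entirely inside the image.
    """
    last_start = image_size - filter_size  # last index where the filter still fits
    return list(range(0, last_start + 1, stride))


# A 3-wide filter sliding over a 6-wide image (sizes chosen for illustration).
print(window_starts(6, 3, 1))  # [0, 1, 2, 3]
print(window_starts(6, 3, 2))  # [0, 2]
```

With a stride of 2 the filter visits only every other position, so it is applied half as often along each axis.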
But why would anyone want a larger stride, given that the filter is then evaluated at fewer positions of the image? There are several reasons; here are the major ones:
- A larger stride lets a series of convolution operations be computed faster on a large image (say, 3000×3000 pixels).
- Less memory is needed to store the results of the convolution operation.
- The size of the output tensor can be reduced to make the input to the next layer of a Convolutional Neural Network smaller.
- Because neighbouring regions overlap less as the stride grows, the output carries less redundant information, which may help reduce overfitting.
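The first three points can be made concrete with a rough count. The sketch below assumes a 3×3 filter, no padding, and a single-channel image (none of which the list above specifies) and counts how many filter positions, and hence multiplications, a 3000×3000 image needs at different strides. The helper names are hypothetical.

```python
def positions_per_axis(image_size, filter_size, stride):
    """Count the filter placements along one axis (no padding assumed)."""
    return len(range(0, image_size - filter_size + 1, stride))


def conv_cost(image_size, filter_size, stride):
    """Return (output_values, multiplications) for a square image and filter."""
    per_axis = positions_per_axis(image_size, filter_size, stride)
    output_values = per_axis * per_axis
    multiplications = output_values * filter_size * filter_size
    return output_values, multiplications


for stride in (1, 2, 3):
    outputs, mults = conv_cost(3000, 3, stride)
    print(f"stride={stride}: {outputs:,} output values, {mults:,} multiplications")
```

Under these assumptions, going from a stride of 1 to a stride of 2 cuts both the number of stored outputs and the number of multiplications to roughly a quarter.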
Finding the size of an output tensor after a series of convolution operations
Generally, in a Convolutional Neural Network, the input image passes through multiple convolution operations, each of which may change the size of the tensor it receives. In this section you will learn an easy way to find the size of an output tensor after a series of convolution operations.
If $n \times n$ is the size of the input image tensor, $f \times f$ is the size of the convolution filter and $s$ is the stride, then the size of the resulting tensor, $o \times o$ (after a series of convolution operations), can be found using the following formula:
$$o = \left\lfloor \frac{n - f}{s} \right\rfloor + 1$$
where $\lfloor \cdot \rfloor$ means that a floating-point result is rounded down to the nearest integer.
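The rule described here (subtract the filter size from the input size, divide by the stride, round down, then add 1) can be written as a small function. This is a sketch that assumes square inputs and filters and no padding; the name `output_size` and its parameter names are chosen here for illustration.

```python
def output_size(input_size, filter_size, stride=1):
    """Side length of the output of one unpadded convolution.

    Implements floor((input_size - filter_size) / stride) + 1.
    """
    if stride < 1:
        raise ValueError("stride must be a positive integer")
    if filter_size > input_size:
        raise ValueError("filter must not be larger than the input")
    # Integer floor division rounds down, matching the floor in the formula.
    return (input_size - filter_size) // stride + 1


print(output_size(4, 3, 1))  # 2
print(output_size(7, 3, 2))  # 3

# Chaining the function gives the size after several convolution layers.
size = 28  # hypothetical 28x28 input
for filter_size, stride in [(5, 1), (3, 2)]:
    size = output_size(size, filter_size, stride)
print(size)  # 11
```

Applying the function once per layer, as in the loop, tracks how the tensor shrinks through a stack of convolutions.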
In the first lesson, we started with an image tensor of size 4×4 and a stride of 1, and after performing a series of convolutions the output tensor was reduced to a size of 2×2. Let us check that the formula gives the same result for a filter size of 3×3:
$$\left\lfloor \frac{4 - 3}{1} \right\rfloor + 1 = 1 + 1 = 2$$
Thus, the size of the output tensor is 2×2.
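To check this result against an actual computation, here is a minimal pure-Python sketch of a strided convolution (implemented, as is common in CNN libraries, as a sliding dot product without flipping the kernel). The function `convolve2d`, the 4×4 input values, and the all-ones filter are illustrative assumptions rather than the example used in the first lesson.

```python
def convolve2d(image, kernel, stride=1):
    """Slide `kernel` over `image` (lists of lists) with the given stride.

    No padding is applied, so the kernel always lies fully inside the image.
    """
    n_rows, n_cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    output = []
    for top in range(0, n_rows - k_rows + 1, stride):
        row = []
        for left in range(0, n_cols - k_cols + 1, stride):
            # Element-wise product of the kernel and the current window, summed.
            total = sum(
                image[top + i][left + j] * kernel[i][j]
                for i in range(k_rows)
                for j in range(k_cols)
            )
            row.append(total)
        output.append(row)
    return output


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
kernel = [[1, 1, 1],
          [1, 1, 1],
          [1, 1, 1]]

result = convolve2d(image, kernel, stride=1)
print(len(result), "x", len(result[0]))  # 2 x 2
```

The 3×3 filter fits at only two positions along each axis of the 4×4 image, which is why the output has two rows and two columns.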
With this, you now know about the Convolution Operation in CNNs. In the next chapter, you will be introduced to another important operation in CNNs, the Pooling Operation.