The Pooling Operation


The pooling operation is another fundamental operation of a Convolutional Neural Network. Thankfully, it can be understood much more quickly, since we already have a sound knowledge of the convolution operation.

In practice, two kinds of pooling operations are most commonly used: max pooling and average pooling. We illustrate both in the sections below.


Max-pooling

The max-pooling operation slides a fixed-size window over an input tensor and, at each window position, outputs the maximum element inside that window. This can be better understood using the following notation-based example:

Consider an image tensor $A$ with a dimension size of 4x4,

$$ A = \begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{pmatrix}$$

Taking a max-pooling of size 2x2 and stride of 2, the output tensor can be obtained as follows:

$$o_1 = \text{max}\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}\ ,\ o_2 = \text{max}\begin{pmatrix} a_{13} & a_{14} \\ a_{23} & a_{24} \end{pmatrix}\ ,\ o_3 = \text{max}\begin{pmatrix} a_{31} & a_{32} \\ a_{41} & a_{42} \end{pmatrix}\ ,\ o_4 = \text{max}\begin{pmatrix} a_{33} & a_{34} \\ a_{43} & a_{44} \end{pmatrix}$$

The final output tensor is then obtained as follows,

$$O = \begin{pmatrix} o_1 & o_2 \\ o_3 & o_4 \end{pmatrix} $$

Let us understand this even more clearly with the help of a numerical example.

Consider an image tensor $A$ with a dimension size of 4x4,

$$ A = \begin{pmatrix} 2 & 4 & 6 & 8 \\ 10 & 12 & 14 & 16 \\ 18 & 20 & 22 & 24 \\ 26 & 28 & 30 & 32 \end{pmatrix}$$

Taking a max-pooling of size 2x2 and stride of 2, the output tensor can be obtained as follows,

$$o_1 = \text{max}\begin{pmatrix} 2 & 4 \\ 10 & 12 \end{pmatrix} = 12 \ ,\ o_2 = \text{max}\begin{pmatrix} 6 & 8 \\ 14 & 16 \end{pmatrix} = 16 \ ,\ o_3 = \text{max}\begin{pmatrix} 18 & 20 \\ 26 & 28 \end{pmatrix} = 28 \ ,\ o_4 = \text{max}\begin{pmatrix} 22 & 24 \\30 & 32 \end{pmatrix} = 32$$

The final output tensor is then obtained as follows,

$$O = \begin{pmatrix} 12 & 16 \\ 28 & 32 \end{pmatrix} $$
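The numerical example above can be reproduced with a few lines of NumPy. This is a minimal sketch rather than a library implementation: the function name `max_pool2d` and its parameters `size` and `stride` are hypothetical names chosen for illustration, and the code assumes a 2-D input, a square window, and no padding.

```python
import numpy as np

def max_pool2d(a, size=2, stride=2):
    """Max-pool a 2-D array with a square window (illustrative, no padding)."""
    # Number of window positions that fit along each axis.
    rows = (a.shape[0] - size) // stride + 1
    cols = (a.shape[1] - size) // stride + 1
    out = np.empty((rows, cols), dtype=a.dtype)
    for i in range(rows):
        for j in range(cols):
            # Slice out the current window and keep its largest element.
            window = a[i * stride:i * stride + size,
                       j * stride:j * stride + size]
            out[i, j] = window.max()
    return out

A = np.array([[2, 4, 6, 8],
              [10, 12, 14, 16],
              [18, 20, 22, 24],
              [26, 28, 30, 32]])

O = max_pool2d(A, size=2, stride=2)
print(O)
```

Each entry of `O` is the maximum of one 2x2 window of `A`, so the printed result should agree with the output tensor obtained by hand.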


Average-pooling

The average-pooling operation slides a fixed-size window over an input tensor and, at each window position, outputs the average of all the elements inside that window. This can be better understood using the following notation-based example:

Consider an image tensor $A$ with a dimension size of 4x4,

$$ A = \begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{pmatrix}$$

Taking an average-pooling of size 2x2 and stride of 2, the output tensor can be obtained as follows,

$$o_1 = \text{avg}\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}\ ,\ o_2 = \text{avg}\begin{pmatrix} a_{13} & a_{14} \\ a_{23} & a_{24} \end{pmatrix}\ ,\ o_3 = \text{avg}\begin{pmatrix} a_{31} & a_{32} \\ a_{41} & a_{42} \end{pmatrix}\ ,\ o_4 = \text{avg}\begin{pmatrix} a_{33} & a_{34} \\ a_{43} & a_{44} \end{pmatrix}$$

The final output tensor is then obtained as follows,

$$O = \begin{pmatrix} o_1 & o_2 \\ o_3 & o_4 \end{pmatrix} $$

Let us understand this even more clearly with the help of a numerical example.

Consider an image tensor $A$ with a dimension size of 4x4,

$$ A = \begin{pmatrix} 2 & 4 & 6 & 8 \\ 10 & 12 & 14 & 16 \\ 18 & 20 & 22 & 24 \\ 26 & 28 & 30 & 32 \end{pmatrix}$$

Taking an average-pooling of size 2x2 and stride of 2, the output tensor can be obtained as follows,

$$o_1 = \text{avg}\begin{pmatrix} 2 & 4 \\ 10 & 12 \end{pmatrix} = 7 \ ,\ o_2 = \text{avg}\begin{pmatrix} 6 & 8 \\ 14 & 16 \end{pmatrix} = 11 \ ,\ o_3 = \text{avg}\begin{pmatrix} 18 & 20 \\ 26 & 28 \end{pmatrix} = 23 \ ,\ o_4 = \text{avg}\begin{pmatrix} 22 & 24 \\30 & 32 \end{pmatrix} = 27$$

The final output tensor is then obtained as follows,

$$O = \begin{pmatrix} 7 & 11 \\ 23 & 27 \end{pmatrix} $$
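The same calculation can be checked in NumPy. This is again only a sketch under stated assumptions: `avg_pool2d`, `size`, and `stride` are hypothetical names introduced for illustration, and the code assumes a 2-D input, a square window, and no padding.

```python
import numpy as np

def avg_pool2d(a, size=2, stride=2):
    """Average-pool a 2-D array with a square window (illustrative, no padding)."""
    # Number of window positions that fit along each axis.
    rows = (a.shape[0] - size) // stride + 1
    cols = (a.shape[1] - size) // stride + 1
    # Averages are generally not integers, so store the result as floats.
    out = np.empty((rows, cols), dtype=float)
    for i in range(rows):
        for j in range(cols):
            # Slice out the current window and take the mean of its elements.
            window = a[i * stride:i * stride + size,
                       j * stride:j * stride + size]
            out[i, j] = window.mean()
    return out

A = np.array([[2, 4, 6, 8],
              [10, 12, 14, 16],
              [18, 20, 22, 24],
              [26, 28, 30, 32]])

O_avg = avg_pool2d(A, size=2, stride=2)
print(O_avg)
```

The only change from a max-pooling loop is the reduction applied to each window (`mean` instead of `max`); the window positions and the output shape are identical.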

The pooling operation is usually performed after the convolution operation. Pooling is performed in order to further reduce the size of the tensor: each window is replaced by a single value (its maximum or its average), so the most important features of the image are kept in a smaller output.
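
The size reduction can be made explicit. As a sketch, assume a square input of side $n$, a square pooling window of side $k$, a stride of $s$, and no padding (the symbols $n$, $k$, $s$ and $m$ are introduced here for illustration only). The output then has side

$$ m = \left\lfloor \frac{n - k}{s} \right\rfloor + 1 $$

For the 4x4 examples in this article, with a 2x2 window and a stride of 2, this gives $m = \lfloor (4 - 2)/2 \rfloor + 1 = 2$, which matches the 2x2 output tensors obtained for both max pooling and average pooling.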


Written by
The Click Reader
