The pooling operation is another fundamental operation of a Convolutional Neural Network. Thankfully, this operation can be understood much quicker since we already have a sound knowledge of the convolution operation.
In practice, two kinds of pooling operations are mostly used: max pooling and average pooling. We will illustrate both of these pooling operations in the sections below.
The max-pooling operation takes in a tensor as an input and outputs the maximum element present in the tensor. This can be better understood using the following notation-based example:
Consider an image tensor $A$ with a dimension size of 4x4,
$$ A = \begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{pmatrix}$$
Taking a max-pooling of size 2x2 and stride of 2, the output tensor can be obtained as follows:
$$o_1 = \text{max}\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}\ ,\ o_2 = \text{max}\begin{pmatrix} a_{13} & a_{14} \\ a_{23} & a_{24} \end{pmatrix}\ ,\ o_3 = \text{max}\begin{pmatrix} a_{31} & a_{32} \\ a_{41} & a_{42} \end{pmatrix}\ ,\ o_4 = \text{max}\begin{pmatrix} a_{33} & a_{34} \\ a_{43} & a_{44} \end{pmatrix}$$
The final output tensor is then obtained as follows,
$$O = \begin{pmatrix} o_1 & o_2 \\ o_3 & o_4 \end{pmatrix} $$
Let us understand this even clearly with the help of a numerical example.
Consider an image tensor $A$ with a dimension size of 4x4,
$$ A = \begin{pmatrix} 2 & 4 & 6 & 8 \\ 10 & 12 & 14 & 16 \\ 18 & 20 & 22 & 24 \\ 26 & 28 & 30 & 32 \end{pmatrix}$$
Taking a max-pooling of size 2x2 and stride of 2, the output tensor can be obtained as follows,
$$o_1 = \text{max}\begin{pmatrix} 2 & 4 \\ 10 & 12 \end{pmatrix} = 12 \ ,\ o_2 = \text{max}\begin{pmatrix} 6 & 8 \\ 14 & 16 \end{pmatrix} = 16 \ ,\ o_3 = \text{max}\begin{pmatrix} 18 & 20 \\ 26 & 28 \end{pmatrix} = 28 \ ,\ o_4 = \text{max}\begin{pmatrix} 22 & 24 \\30 & 32 \end{pmatrix} = 32$$
The final output tensor is then obtained as follows,
$$O = \begin{pmatrix} 12 & 16 \\ 28 & 32 \end{pmatrix} $$
The average-pooling operation takes in a tensor as an input and outputs the average of all the elements present in the tensor. This can be better understood using the following notation-based example:
Consider an image tensor $A$ with a dimension size of 4x4,
$$ A = \begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{pmatrix}$$
Taking a average-pooling of size 2x2 and stride of 2, the output tensor can be obtained as follows,
$$o_1 = \text{avg}\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}\ ,\ o_2 = \text{avg}\begin{pmatrix} a_{13} & a_{14} \\ a_{23} & a_{24} \end{pmatrix}\ ,\ o_3 = \text{avg}\begin{pmatrix} a_{31} & a_{32} \\ a_{41} & a_{42} \end{pmatrix}\ ,\ o_4 = \text{avg}\begin{pmatrix} a_{33} & a_{34} \\ a_{43} & a_{44} \end{pmatrix}$$
The final output tensor is then obtained as follows,
$$O = \begin{pmatrix} o_1 & o_2 \\ o_3 & o_4 \end{pmatrix} $$
Let us understand this even clearly with the help of a numerical example.
Consider an image tensor $A$ with a dimension size of 4x4,
$$ A = \begin{pmatrix} 2 & 4 & 6 & 8 \\ 10 & 12 & 14 & 16 \\ 18 & 20 & 22 & 24 \\ 26 & 28 & 30 & 32 \end{pmatrix}$$
Taking a max-pooling of size 2x2 and stride of 2, the output tensor can be obtained as follows,
$$o_1 = \text{avg}\begin{pmatrix} 2 & 4 \\ 10 & 12 \end{pmatrix} = 7 \ ,\ o_2 = \text{avg}\begin{pmatrix} 6 & 8 \\ 14 & 16 \end{pmatrix} = 11 \ ,\ o_3 = \text{avg}\begin{pmatrix} 18 & 20 \\ 26 & 28 \end{pmatrix} = 23 \ ,\ o_4 = \text{avg}\begin{pmatrix} 22 & 24 \\30 & 32 \end{pmatrix} = 27$$
The final output tensor is then obtained as follows,
$$O = \begin{pmatrix} 7 & 11 \\ 23 & 27 \end{pmatrix} $$
The pooling operation is usually performed after the convolution operation. Pooling is performed in order to further reduce the size of the input tensor by selecting only the important features from an image.
Do you want to learn Python, Data Science, and Machine Learning while getting certified? Here are some best selling Datacamp courses that we recommend you enroll in: