
The Pooling Operation

Convolutional Neural Network Theoretical Course
Tutorial

The pooling operation is another fundamental operation of a Convolutional Neural Network. Thankfully, it can be understood much more quickly, since we already have a sound knowledge of the convolution operation.

Two kinds of pooling are most commonly used in practice: max pooling and average pooling. We illustrate both in the sections below.


Max-pooling

The max-pooling operation slides a window over an input tensor and, at each window position, outputs the maximum element inside that window. This can be better understood using the following notation-based example:

Consider an image tensor A with a dimension size of 4×4,

    \[A = \begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{pmatrix}\]

Taking a max-pooling of size 2×2 and stride of 2, the output tensor can be obtained as follows:

    \[o_1 = \text{max}\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}\ ,\ o_2 = \text{max}\begin{pmatrix} a_{13} & a_{14} \\ a_{23} & a_{24} \end{pmatrix}\ ,\ o_3 = \text{max}\begin{pmatrix} a_{31} & a_{32} \\ a_{41} & a_{42} \end{pmatrix}\ ,\ o_4 = \text{max}\begin{pmatrix} a_{33} & a_{34} \\ a_{43} & a_{44} \end{pmatrix}\]

The final output tensor is then obtained as follows,

    \[O = \begin{pmatrix} o_1 & o_2 \\ o_3 & o_4 \end{pmatrix}\]

Let us understand this more clearly with the help of a numerical example.

Consider an image tensor A with a dimension size of 4×4,

    \[A = \begin{pmatrix} 2 & 4 & 6 & 8 \\ 10 & 12 & 14 & 16 \\ 18 & 20 & 22 & 24 \\ 26 & 28 & 30 & 32 \end{pmatrix}\]

Taking a max-pooling of size 2×2 and stride of 2, the output tensor can be obtained as follows,

    \[o_1 = \text{max}\begin{pmatrix} 2 & 4 \\ 10 & 12 \end{pmatrix} = 12 \ ,\ o_2 = \text{max}\begin{pmatrix} 6 & 8 \\ 14 & 16 \end{pmatrix} = 16 \ ,\ o_3 = \text{max}\begin{pmatrix} 18 & 20 \\ 26 & 28 \end{pmatrix} = 28 \ ,\ o_4 = \text{max}\begin{pmatrix} 22 & 24 \\30 & 32 \end{pmatrix} = 32\]

The final output tensor is then obtained as follows,

    \[O = \begin{pmatrix} 12 & 16 \\ 28 & 32 \end{pmatrix}\]
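
The numerical example above can be reproduced with a short NumPy sketch. The function name `max_pool2d` and its parameters are illustrative choices, not part of any library, and the sketch assumes a 2-D input, a square window, and no padding.

```python
import numpy as np

def max_pool2d(a, size=2, stride=2):
    """Max-pool a 2-D array with a square window (hypothetical helper, no padding)."""
    rows = (a.shape[0] - size) // stride + 1
    cols = (a.shape[1] - size) // stride + 1
    out = np.empty((rows, cols), dtype=a.dtype)
    for i in range(rows):
        for j in range(cols):
            # Cut out the window whose top-left corner is (i*stride, j*stride).
            window = a[i * stride:i * stride + size,
                       j * stride:j * stride + size]
            out[i, j] = window.max()
    return out

A = np.array([[2, 4, 6, 8],
              [10, 12, 14, 16],
              [18, 20, 22, 24],
              [26, 28, 30, 32]])

O_max = max_pool2d(A, size=2, stride=2)
print(O_max)
```

Because the stride equals the window size here, the windows do not overlap and every element of A falls into exactly one window.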


Average-pooling

The average-pooling operation slides a window over an input tensor and, at each window position, outputs the average of all the elements inside that window. This can be better understood using the following notation-based example:

Consider an image tensor A with a dimension size of 4×4,

    \[A = \begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{pmatrix}\]

Taking an average-pooling of size 2×2 and a stride of 2, the output tensor can be obtained as follows:

    \[o_1 = \text{avg}\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}\ ,\ o_2 = \text{avg}\begin{pmatrix} a_{13} & a_{14} \\ a_{23} & a_{24} \end{pmatrix}\ ,\ o_3 = \text{avg}\begin{pmatrix} a_{31} & a_{32} \\ a_{41} & a_{42} \end{pmatrix}\ ,\ o_4 = \text{avg}\begin{pmatrix} a_{33} & a_{34} \\ a_{43} & a_{44} \end{pmatrix}\]

The final output tensor is then obtained as follows,

    \[O = \begin{pmatrix} o_1 & o_2 \\ o_3 & o_4 \end{pmatrix}\]

Let us understand this more clearly with the help of a numerical example.

Consider an image tensor A with a dimension size of 4×4,

    \[A = \begin{pmatrix} 2 & 4 & 6 & 8 \\ 10 & 12 & 14 & 16 \\ 18 & 20 & 22 & 24 \\ 26 & 28 & 30 & 32 \end{pmatrix}\]

Taking an average-pooling of size 2×2 and a stride of 2, the output tensor can be obtained as follows:

    \[o_1 = \text{avg}\begin{pmatrix} 2 & 4 \\ 10 & 12 \end{pmatrix} = 7 \ ,\ o_2 = \text{avg}\begin{pmatrix} 6 & 8 \\ 14 & 16 \end{pmatrix} = 11 \ ,\ o_3 = \text{avg}\begin{pmatrix} 18 & 20 \\ 26 & 28 \end{pmatrix} = 23 \ ,\ o_4 = \text{avg}\begin{pmatrix} 22 & 24 \\30 & 32 \end{pmatrix} = 27\]

The final output tensor is then obtained as follows,

    \[O = \begin{pmatrix} 7 & 11 \\ 23 & 27 \end{pmatrix}\]
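
The same example can be checked with a minimal NumPy sketch. As before, `avg_pool2d` is a hypothetical helper written for illustration, assuming a 2-D input, a square window, and no padding.

```python
import numpy as np

def avg_pool2d(a, size=2, stride=2):
    """Average-pool a 2-D array with a square window (hypothetical helper, no padding)."""
    rows = (a.shape[0] - size) // stride + 1
    cols = (a.shape[1] - size) // stride + 1
    # Averages are generally not integers, so the output is stored as floats.
    out = np.empty((rows, cols), dtype=float)
    for i in range(rows):
        for j in range(cols):
            window = a[i * stride:i * stride + size,
                       j * stride:j * stride + size]
            out[i, j] = window.mean()
    return out

A = np.array([[2, 4, 6, 8],
              [10, 12, 14, 16],
              [18, 20, 22, 24],
              [26, 28, 30, 32]])

O_avg = avg_pool2d(A, size=2, stride=2)
print(O_avg)
```

The only difference from max pooling is the reduction applied to each window: the mean instead of the maximum.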

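The pooling operation is usually performed after the convolution operation. Pooling further reduces the size of the tensor by summarizing each window with a single value: in the examples above, a 4×4 input became a 2×2 output. Max pooling keeps the largest value in each window, while average pooling keeps the mean of the window.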
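
The size reduction follows a simple rule when no padding is used: along each dimension, an input of length n pooled with window size k and stride s yields floor((n - k) / s) + 1 outputs. The sketch below states this rule as code; `pooled_size` is an illustrative name, and the no-padding assumption matches the examples in this tutorial.

```python
def pooled_size(n, size, stride):
    """Output length along one dimension for pooling without padding (hypothetical helper)."""
    if n < size:
        raise ValueError("window is larger than the input")
    return (n - size) // stride + 1

# The 4x4 examples in this tutorial use a 2x2 window with a stride of 2.
side = pooled_size(4, size=2, stride=2)
print(side, "x", side)
```

For the 4×4 tensor used in this tutorial, the rule gives (4 - 2) / 2 + 1 = 2 along each dimension, which matches the 2×2 outputs obtained above.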

