The learning process of a deep neural network closely resembles that of a single neuron. Training runs for a finite number of iterations in which data is fed forward through the network and the weights are then adjusted using back-propagation (gradient descent).
First, the input data is fed into the network through the input layer, and every input is assigned a random weight at the start of training.
Then the inputs, along with their respective weights, are passed into the hidden layers, where each neuron applies an activation function to the linear combination of its inputs. Finally, the output layer outputs the probability of each label in the training set, and the loss is calculated using a loss function $J(w)$.
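The forward pass described above can be sketched in a few lines of numpy. This is a minimal illustration, not a full framework: the layer sizes, the sigmoid hidden activation, and the softmax/cross-entropy pairing at the output are assumptions chosen for the example.

```python
import numpy as np

def sigmoid(z):
    # activation applied to the linear combination of a neuron's inputs
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # output layer: turns scores into a probability for each label
    e = np.exp(z - z.max())
    return e / e.sum()

def cross_entropy(p, y):
    # loss J(w): negative log-probability assigned to the true label y
    return -np.log(p[y])

rng = np.random.default_rng(0)
x = np.array([0.5, -0.2, 0.1])        # one training example, 3 features
W1 = rng.normal(size=(4, 3)) * 0.1    # random initial weights, hidden layer
W2 = rng.normal(size=(2, 4)) * 0.1    # random initial weights, output layer

h = sigmoid(W1 @ x)                   # hidden-layer activations
p = softmax(W2 @ h)                   # probability of each of the 2 labels
loss = cross_entropy(p, y=0)          # loss for this example
```

Note that `p` always sums to 1, so the output layer can be read directly as label probabilities.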
After the loss is calculated, the network back-propagates the error, and gradient descent updates the weights of every neuron in every layer so as to decrease the loss. The gradient descent update is the same one we studied before:
$$ (\textbf{w}^{(layer)}_{neuron})_{n+1} = (\textbf{w}^{(layer)}_{neuron})_n - \alpha\nabla J((\textbf{w}^{(layer)}_{neuron})_n)$$
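The update rule is one line of code once the gradient is known. As a sketch, here it is applied to a hypothetical loss $J(w) = \lVert w \rVert^2$, whose gradient is $2w$; the loss and learning rate are assumptions for illustration only.

```python
import numpy as np

def gd_step(w, grad, alpha=0.1):
    # w_{n+1} = w_n - alpha * grad J(w_n)
    return w - alpha * grad

w = np.array([1.0, -2.0])
w_next = gd_step(w, grad=2 * w, alpha=0.1)  # → array([ 0.8, -1.6])
```

In a real network the same step is applied to every weight matrix, with the gradients supplied by back-propagation.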
This entire process is repeated until the loss of the neural network falls below a certain threshold, at which point training stops.
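Putting the steps together, the whole loop (forward pass, loss, back-propagation, weight update, stop at a threshold) can be sketched with a single-neuron model on a toy problem. The AND dataset, learning rate, and threshold below are assumptions chosen so the example converges quickly.

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([0., 0., 0., 1.])       # toy target: logical AND
w = rng.normal(size=2) * 0.1         # random initial weights
b = 0.0
alpha, threshold = 0.5, 0.05

for _ in range(50_000):
    z = X @ w + b
    p = 1.0 / (1.0 + np.exp(-z))     # forward pass (sigmoid neuron)
    loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
    if loss < threshold:             # stop once the loss reaches the threshold
        break
    grad_z = (p - y) / len(y)        # back-propagation through the loss
    w -= alpha * (X.T @ grad_z)      # gradient-descent weight update
    b -= alpha * grad_z.sum()
```

The same structure scales to deep networks; only the forward pass and the gradient computation grow, while the update rule and the stopping criterion stay the same.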