The learning process of a deep neural network closely mirrors that of a single neuron. Training runs for a finite number of iterations in which data is fed forward through the network and the weights are then adjusted via back-propagation and gradient descent.
First, the input data enters the network through the input layer, with all weights initialized to random values at the start of training.
The inputs, scaled by their respective weights, then pass through the hidden layers, where each neuron applies an activation function to the linear combination of its inputs. Finally, the output layer produces a probability for each label in the training set, and the loss is computed with a loss function $J(w)$.
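To make the forward pass concrete, here is a minimal NumPy sketch of a network with one hidden layer. The layer sizes, the sigmoid activation, the softmax output, and the cross-entropy loss are illustrative assumptions, not a prescribed recipe:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))  # subtract max for numerical stability
    return e / e.sum(axis=1, keepdims=True)

# Random initial weights, as described above (here: 4 inputs, 5 hidden units, 3 labels)
W1 = rng.normal(scale=0.1, size=(4, 5))
b1 = np.zeros(5)
W2 = rng.normal(scale=0.1, size=(5, 3))
b2 = np.zeros(3)

def forward(X):
    hidden = sigmoid(X @ W1 + b1)      # activation of the linear combination of inputs
    probs = softmax(hidden @ W2 + b2)  # output layer: probability of each label
    return hidden, probs

def cross_entropy(probs, y):
    # Loss J(w): mean negative log-probability assigned to the true labels
    return -np.log(probs[np.arange(len(y)), y]).mean()
```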
Once the loss is calculated, the network back-propagates the error, and gradient descent updates the weights of every neuron in every layer so as to decrease the loss. The gradient descent update is the same one we studied before:
$$ \left(\mathbf{w}^{(\text{layer})}_{\text{neuron}}\right)_{n+1} = \left(\mathbf{w}^{(\text{layer})}_{\text{neuron}}\right)_n - \alpha\,\nabla J\!\left(\left(\mathbf{w}^{(\text{layer})}_{\text{neuron}}\right)_n\right)$$
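Continuing the sketch above, one back-propagation step computes the gradient of $J(w)$ for each layer and then applies the update rule. The gradient expressions below follow from the softmax/cross-entropy pairing assumed earlier, and `alpha` plays the role of $\alpha$ in the equation:

```python
def backprop_step(X, y, alpha=0.1):
    global W1, b1, W2, b2
    hidden, probs = forward(X)

    # Gradient of the mean cross-entropy w.r.t. the output-layer inputs:
    # probs - one_hot(y), averaged over the batch
    delta2 = probs.copy()
    delta2[np.arange(len(y)), y] -= 1.0
    delta2 /= len(y)

    dW2 = hidden.T @ delta2
    db2 = delta2.sum(axis=0)

    # Back-propagate through the sigmoid: s'(z) = s(z) * (1 - s(z))
    delta1 = (delta2 @ W2.T) * hidden * (1.0 - hidden)
    dW1 = X.T @ delta1
    db1 = delta1.sum(axis=0)

    # w_{n+1} = w_n - alpha * grad J(w_n), applied to every layer's weights
    W1 -= alpha * dW1; b1 -= alpha * db1
    W2 -= alpha * dW2; b2 -= alpha * db2
```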
This entire process is repeated until the network's loss drops below a chosen threshold, at which point training stops.
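Putting the pieces together, the training loop alternates the forward pass, the loss computation, and the back-propagation step until the loss crosses the threshold. The toy data, the threshold value, and the epoch cap here are arbitrary choices for illustration:

```python
X = rng.normal(size=(32, 4))     # toy batch of 32 examples with 4 features
y = rng.integers(0, 3, size=32)  # toy labels for 3 classes

threshold, max_epochs = 0.05, 10_000  # arbitrary stopping criteria
for epoch in range(max_epochs):
    backprop_step(X, y)
    loss = cross_entropy(forward(X)[1], y)
    if loss < threshold:  # stop once the loss is low enough
        print(f"converged at epoch {epoch}, loss {loss:.4f}")
        break
```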