Loss Functions in Deep Learning

In machine learning, the loss function measures how far the model’s predictions are from the target. The closer the model’s predictions are to the target, the smaller the loss.

When training a network, the network makes its predictions and the loss is computed from them. This loss is then used to update the weights of the network, with the aim of making the loss smaller. This step is repeated for a specified number of iterations or until the loss stops improving.
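The loop described above (predict, compute the loss, update the weights, repeat) can be sketched in a few lines. This is only a minimal illustration, not a real deep learning setup: it assumes a one-variable linear model trained by plain gradient descent on the mean squared error, and the names `train`, `lr`, `max_iters`, and `tol` are made up for this example.

```python
def train(xs, ys, lr=0.05, max_iters=1000, tol=1e-9):
    """Fit y ~ w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0  # the "weights" of this tiny model
    prev_loss = float("inf")
    n = len(xs)
    loss = prev_loss
    for _ in range(max_iters):
        # 1. The model makes its predictions.
        preds = [w * x + b for x in xs]
        # 2. The loss measures how far the predictions are from the targets.
        loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) / n
        # 3. Stop early once the loss no longer improves noticeably.
        if prev_loss - loss < tol:
            break
        prev_loss = loss
        # 4. Update the weights in the direction that reduces the loss.
        grad_w = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / n
        grad_b = sum(2 * (p - y) for p, y in zip(preds, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, loss


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # toy data generated from y = 2x + 1
w, b, final_loss = train(xs, ys)
print(w, b, final_loss)
```

On this toy data the learned `w` and `b` should end up close to 2 and 1. In a real network, a framework would compute the gradients automatically, but the loop has the same shape.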

In this blog post, I am going to focus on the more commonly used loss functions in deep learning.

The loss function to use depends on the type of problem being solved.

Regression Problems

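The most commonly used loss function for regression problems is the mean squared error (MSE) loss. The MSE, shown in the equation below, is the average of the squared differences between the model’s predicted values and the target values.

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

Here n is the number of examples, y_i is the target value, and \hat{y}_i is the model’s prediction for the i-th example.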
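As a quick numeric check of this definition, here is a small sketch in plain Python. The function name `mean_squared_error` and the sample values are just for illustration.

```python
def mean_squared_error(targets, predictions):
    """Average of the squared differences between targets and predictions."""
    if not targets or len(targets) != len(predictions):
        raise ValueError("targets and predictions must be non-empty and equally long")
    squared_diffs = [(y - p) ** 2 for y, p in zip(targets, predictions)]
    return sum(squared_diffs) / len(targets)


targets = [3.0, -0.5, 2.0, 7.0]
predictions = [2.5, 0.0, 2.0, 8.0]
mse = mean_squared_error(targets, predictions)
print(mse)  # squared differences are 0.25, 0.25, 0.0, 1.0, so the mean is 0.375
```

Because the differences are squared, one large error (the 1.0 above) contributes more to the loss than several small ones.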

Classification Problems

The most commonly used loss function for classification problems is the cross-entropy loss, which is also known as logarithmic loss, logistic loss, or log loss. Given a set of events and the probabilities a model assigns to them, the cross-entropy is the negative logarithm of the likelihood of those events under those probabilities. A small cross-entropy means that the observed events are very likely under the model, while a large cross-entropy means that they are unlikely.

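For binary classification over N examples, the cross-entropy is commonly written as:

$$\mathrm{CE} = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \right]$$

In the above equation, y represents the actual label (0 or 1) while p represents the predicted probability of the positive class.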
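To make the roles of y and p concrete, here is a minimal sketch. It assumes binary classification, with y equal to 0 or 1 and p the predicted probability of class 1. The function name `binary_cross_entropy` and the clipping constant `eps` are choices made for this example, not part of any particular library.

```python
import math


def binary_cross_entropy(labels, probs, eps=1e-12):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(labels, probs):
        # Clip p away from 0 and 1 so that log() never receives 0.
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


labels = [1, 0, 1, 1]
confident = binary_cross_entropy(labels, [0.9, 0.1, 0.8, 0.95])  # mostly right
poor = binary_cross_entropy(labels, [0.2, 0.7, 0.3, 0.4])        # mostly wrong
print(confident, poor)
```

The predictions that put high probability on the correct labels give the smaller loss, which matches the idea that a small cross-entropy means the observed labels are likely under the model.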