The Mean Squared Error (MSE) loss is an adaptation of the Quadratic (L2) loss that takes into account the number of input-output pairs used.
\( L \) | This is the symbol for a loss function: a function that quantifies how far a model's prediction is from the target. |
\( h \) | This symbol denotes a model in machine learning. |
\( y \) | This symbol stands for the ground truth of a sample. In supervised learning this is often paired with the corresponding input. |
\( u \) | This symbol denotes the input of a model. |
MSE is a loss function, so it takes the form:
\[\htmlClass{sdt-0000000072}{L} : \htmlClass{sdt-0000000045}{\mathbb{R}}^{\htmlClass{sdt-0000000117}{n}} \times \htmlClass{sdt-0000000045}{\mathbb{R}}^{\htmlClass{sdt-0000000117}{n}} \rightarrow \htmlClass{sdt-0000000045}{\mathbb{R}}_{\geq 0}\]
The MSE loss is the L2 loss normalized by the size of the data set; the mean is often more informative than a plain sum of errors because it does not grow with the number of samples.
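As a concrete sketch (using plain Python lists for predictions and targets; the function names are illustrative, not from the source), the two losses differ only by the \( \frac{1}{N} \) factor:

```python
def l2_loss(predictions, targets):
    """Quadratic (L2) loss: the sum of squared errors."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets))

def mse_loss(predictions, targets):
    """Mean Squared Error: the L2 loss divided by the number of pairs N."""
    return l2_loss(predictions, targets) / len(targets)
```

Because MSE divides by \( N \), its value is comparable across data sets of different sizes, whereas the raw L2 loss keeps growing as more samples are added.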
The symbol \(y\) represents the ground truth of a sample in machine learning. Samples come in pairs of an input and the ground truth, or "target output".
The symbol for a model is \(h\). It represents a machine learning model that takes an input and gives an output.
The symbol \(u\) represents the input of a model.
Assume we want to fit a quadratic polynomial to the values \( \htmlClass{sdt-0000000037}{y} = (1, 0, 2) \) generated from the parabola \( y = 0 + \frac{1}{2} x + \frac{3}{2} x^2 \).
We choose a model \( \htmlClass{sdt-0000000084}{h} \) in the form of a quadratic polynomial: \( \htmlClass{sdt-0000000084}{h}(\htmlClass{sdt-0000000103}{u}_i) = a_0 + a_1 \htmlClass{sdt-0000000103}{u}_i + a_2 \htmlClass{sdt-0000000103}{u}_i^2 \) with unknown coefficients \( a_0, a_1, a_2 \).
Now consider the inputs \( \htmlClass{sdt-0000000103}{u} = (-1, 0, 1) \) with model predictions \( \htmlClass{sdt-0000000084}{h}(\htmlClass{sdt-0000000103}{u}) = (0, 1, 4) \).
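One set of coefficients consistent with these predictions (a hypothetical choice for illustration, since the source leaves \( a_0, a_1, a_2 \) unspecified) is \( a_0 = 1, a_1 = 2, a_2 = 1 \):

```python
def h(u, a0=1.0, a1=2.0, a2=1.0):
    # Quadratic model h(u) = a0 + a1*u + a2*u^2; these coefficients are a
    # hypothetical choice that reproduces the predictions (0, 1, 4).
    return a0 + a1 * u + a2 * u ** 2

predictions = [h(u) for u in (-1, 0, 1)]  # → [0.0, 1.0, 4.0]
```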
The MSE loss is:
\[ \begin{align*} \htmlClass{sdt-0000000072}{L}_\text{MSE} &= \frac{1}{N} \sum_{i=1}^{N} (\htmlClass{sdt-0000000084}{h}(\htmlClass{sdt-0000000103}{u}_i) - \htmlClass{sdt-0000000037}{y}_i)^2 \\ &= \frac{1}{3} \left[ (0 - 1)^2 + (1 - 0)^2 + (4 - 2)^2 \right] \\ &= \frac{1}{3}(1 + 1 + 4) \\ &= \frac{6}{3} = 2 \end{align*} \]
On the other hand, the L2 loss is \(6\), since it is the same sum of squared errors without the division by \(N = 3\).
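The arithmetic of the worked example can be checked directly:

```python
y = [1, 0, 2]    # ground truth values
h_u = [0, 1, 4]  # model predictions h(u) for u = (-1, 0, 1)

squared_errors = [(p - t) ** 2 for p, t in zip(h_u, y)]  # [1, 1, 4]
l2 = sum(squared_errors)  # L2 loss: 6
mse = l2 / len(y)         # MSE loss: 2.0
```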