The goal of every supervised learning algorithm is to find the optimal model \( \htmlClass{sdt-0000000002}{\hat{f}} \), the model with the lowest risk for a given problem. Conceptually, this means comparing the candidate models \( \htmlClass{sdt-0000000084}{h} \) in the hypothesis space \( \htmlClass{sdt-0000000039}{\mathcal{H}} \) and selecting the one with the minimum risk. The risk is the expected value of a chosen loss function over the distribution of inputs and outputs, so it measures how the model is expected to perform on unseen data. Minimizing it yields the model that generalizes best.
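Written with the symbols defined on this page, and assuming the risk is the expected loss over input-output pairs as described above, the definition can be sketched as:

\[ \hat{f} = \underset{h \in \mathcal{H}}{\operatorname{argmin}} \; E\left[ L(h(U), Y) \right] \]

Here \( h(U) \) is the model's output for the random input \( U \), and the \( \operatorname{argmin} \) returns the model \( h \) for which the expected loss is smallest.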
| Symbol | Description |
| --- | --- |
| \( \hat{f} \) | The optimal model for a problem. |
| \( U \) | A random variable for the inputs of a problem. |
| \( Y \) | A random variable for the outputs of a problem. |
| \( E \) | The expected (average) value of a random variable under its distribution. |
| \( \mathcal{H} \) | The set of possible models. |
| \( L \) | A loss function. It measures how far a model's prediction is from the correct output. |
The symbol \(\hat{f}\) denotes the optimal model for a problem: the model with the lowest risk \( \htmlClass{sdt-0000000062}{R} \) over pairs of inputs and outputs. The goal of machine learning is to adjust a candidate model \( \htmlClass{sdt-0000000084}{h} \) until it is as close to \(\hat{f}\) as possible.
The symbol for a model is \(h\). It represents a machine learning model that takes an input and gives an output.
The symbol \( \mathcal{H} \) denotes the set of possible models, often from a particular class like "polynomials of any degree" or "multi-layer perceptron networks". For any learning algorithm, \( \mathcal{H} \) indicates the space where an optimal model may be found.
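As a rough illustration of selecting the minimum-risk model from a hypothesis space, the sketch below uses a small, finite \( \mathcal{H} \) and approximates the expectation by averaging the loss over sampled input-output pairs. The data-generating process, the four candidate models, and the names `estimate_risk` and `select_model` are all assumptions made for this example. The true risk is an expectation that generally cannot be computed exactly.

```python
import random


def squared_loss(prediction, target):
    """Loss L: squared difference between prediction and target."""
    return (prediction - target) ** 2


def estimate_risk(model, samples, loss):
    """Approximate E[L(h(U), Y)] by averaging the loss over (u, y) samples."""
    return sum(loss(model(u), y) for u, y in samples) / len(samples)


def select_model(hypothesis_space, samples, loss):
    """Return the name of the candidate with the lowest estimated risk,
    together with the estimated risk of every candidate."""
    risks = {
        name: estimate_risk(h, samples, loss)
        for name, h in hypothesis_space.items()
    }
    best_name = min(risks, key=risks.get)
    return best_name, risks


# Hypothetical data-generating process: Y = 2U + small Gaussian noise.
rng = random.Random(0)
samples = []
for _ in range(1000):
    u = rng.uniform(-1.0, 1.0)
    samples.append((u, 2.0 * u + rng.gauss(0.0, 0.1)))

# A tiny, finite hypothesis space H (an assumption for illustration only).
hypothesis_space = {
    "h(u) = 0": lambda u: 0.0,
    "h(u) = u": lambda u: u,
    "h(u) = 2u": lambda u: 2.0 * u,
    "h(u) = u^2": lambda u: u * u,
}

best_name, risks = select_model(hypothesis_space, samples, squared_loss)
print("selected:", best_name)
for name, risk in sorted(risks.items(), key=lambda item: item[1]):
    print(f"{name}: estimated risk {risk:.4f}")
```

Because the candidate `h(u) = 2u` matches the assumed data-generating process up to noise, its estimated risk is the smallest of the four. Real hypothesis spaces are usually infinite, so learning algorithms search them by optimization instead of enumerating every candidate.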
See Purpose of Machine Learning for an empirical method to obtain the best model and Risk of Optimal Model for an example of how to use the \(\operatorname{argmin}\) operator.