Epistemic and aleatoric uncertainty in machine learning
The error in a machine learning prediction is a combination of epistemic and aleatoric uncertainty. Understanding the two is essential for improving a model and explaining its performance. For example, collecting more data reduces only the epistemic uncertainty, not the aleatoric uncertainty, so it tackles just one source of the error.
Suppose we want to discover the relationship between one input variable and one output variable. We collect the data (Fig. 1 - blue dots) and fit a line (Fig. 1 - orange line). In practice we almost never know the true relationship, but in this case it is known, since the data was generated synthetically (Fig. 1 - blue line).
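As a concrete illustration, here is a minimal sketch of such a setup. The true relationship, its coefficients, and the noise level are all assumptions for illustration, since the actual data behind Figure 1 is not shown here; any linear truth with additive noise would do.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_model(x):
    # Assumed "blue line": a hypothetical linear truth chosen for this sketch.
    return 2.0 * x + 1.0

n = 30
x = rng.uniform(0, 10, n)
y = true_model(x) + rng.normal(0, 2.0, n)  # noisy observations ("blue dots")

# Fit a line ("orange line") by ordinary least squares.
slope, intercept = np.polyfit(x, y, deg=1)
print(f"fitted: y = {slope:.2f}x + {intercept:.2f}  (true: y = 2.00x + 1.00)")
```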
Figure 1 makes it easy to see what epistemic and aleatoric uncertainty are. Epistemic uncertainty is the difference between the true model (blue line, what we aim for) and our model (orange line, our fit). Even if we recovered the true model exactly, some uncertainty would remain: this irreducible part, coming from the noise in the data itself, is the aleatoric uncertainty.
The difference between the model prediction (orange line) and an observation (blue dot) is the error (red line), which is the sum of the epistemic and aleatoric components.
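In the synthetic setting this decomposition can be checked directly for a single observation, because the true model is known. The sketch below reuses the same assumed setup as above (hypothetical linear truth, Gaussian noise):

```python
import numpy as np

rng = np.random.default_rng(0)
true_model = lambda x: 2.0 * x + 1.0  # assumed true relationship

x = rng.uniform(0, 10, 30)
y = true_model(x) + rng.normal(0, 2.0, 30)
slope, intercept = np.polyfit(x, y, deg=1)
predict = lambda x: slope * x + intercept

# Pick one observation and split its error into the two components.
x0, y0 = x[0], y[0]
error     = predict(x0) - y0               # "red line": prediction vs. observation
epistemic = predict(x0) - true_model(x0)   # our fit vs. the true model
aleatoric = true_model(x0) - y0            # noise: true model vs. observation

# The decomposition holds exactly by construction.
assert np.isclose(error, epistemic + aleatoric)
print(f"error={error:.3f} = epistemic({epistemic:.3f}) + aleatoric({aleatoric:.3f})")
```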
Now that we know what these errors are, we can look at how to reduce them. Getting more data, tuning hyper-parameters, and selecting a better model all help with the epistemic uncertainty. Figure 2 shows how getting more data decreases the epistemic uncertainty but leaves the aleatoric uncertainty untouched; the sketch below demonstrates the same effect numerically.
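Under the same assumed setup as before, fitting the line on progressively larger samples shrinks the gap between the fitted and true lines (epistemic), while the residual spread stays near the noise level (aleatoric):

```python
import numpy as np

rng = np.random.default_rng(0)
true_model = lambda x: 2.0 * x + 1.0  # assumed true relationship
noise_sd = 2.0                        # assumed noise level

for n in (10, 100, 10_000):
    x = rng.uniform(0, 10, n)
    y = true_model(x) + rng.normal(0, noise_sd, n)
    slope, intercept = np.polyfit(x, y, deg=1)

    # Epistemic part: gap between fitted and true line (shrinks as n grows).
    x_test = np.linspace(0, 10, 200)
    epistemic_rmse = np.sqrt(
        np.mean((slope * x_test + intercept - true_model(x_test)) ** 2)
    )
    # Aleatoric part: residual spread around the fit (stays near noise_sd).
    residual_sd = np.std(y - (slope * x + intercept))
    print(f"n={n:>6}: epistemic RMSE={epistemic_rmse:.3f}, "
          f"residual sd={residual_sd:.3f}")
```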