We are often confused by explanations on what ML metrics really are or represent, this post is my attempt to clarify this question. I have developed a simple mentality when it comes to me thinking of those metrics, which one I pick to evaluate my models and why.
Confusion Matrix:
When we think about the error predictions and errors, we might want to construct the confusion(truth) matrix of the model, which helps a lot visualizing it.
Predicted | |||
Positives | Negatives | ||
Real | Positives | TP | FN |
Negatives | FP | TN |
Where the “Positive”/”Negative” word at the end corresponds to the predictions of the model over a target category. Quite often, if we work with a logical(binary) classifier, we tend to call “True” or “False” to one state or the other. When we have multiple classifiers, it applies the logic of “one-vs-all”, in this case we must talk about one category being the “True” state, and the rest are “False” states.
Types of errors in Supervised Learning:
- Model err. : This is the actual error of the model. This measures the ratio of the accurate predictions of the model (in each cathegory) between all.
- Type I err. : Also called FP (false positives). This is the error that the model introduces when it classifies an event belonging to our interest category (target classes), when in fact it is not.
- Type II err. : Also called FN (false negatives). This is the error that the model introduces when it classifies an event not belonging to our interest category (target classes), when it fact it is.
Once we understand those concepts, some metrics come along with.
Metrics associated:
- (Model): \(Accuracy = {True\_predictions \over All\_predictions } = {TP + TN \over TP + TN + FP + FN}\)
- (Type I): \(Precision = {True\_predicted\_positives \over Predicted\_positives} = {TP \over TP+FP}\)
- (Type II): \(Recall = {True\_predicted\_positives \over Real\_positives} = {TP \over TP+FN}\)
- (Mix Type I/II): \(F1\_score = { True\_predicted\_positives \over TP + Average\_of\_errors} = {TP \over TP + 1/2(FP+FN)} = {2 P*R \over P+R}\)
There are other derived metrics, but they are derived from the ones above and thus, we will not cover them here.
“Just as electricity transformed almost everything 100 years ago, today I actually have a hard time thinking of an industry that I don’t think AI (Artificial Intelligence) will transform in the next several years.” – Andrew Ng
Regards