Question 1

What are true positives, false positives, true negatives, and false negatives?

Accepted Answer

In binary classification, a True Positive (TP) is a case where the model predicted "positive" and the actual label is positive. A False Positive (FP) is a case where the model predicted "positive" but the actual label is negative (Type I error, also called a false alarm). A True Negative (TN) is a case where the model predicted "negative" and the actual label is negative. A False Negative (FN) is a case where the model predicted "negative" but the actual label is positive (Type II error, also called a miss). Every metric derived from the confusion matrix is a different arithmetic combination of these four counts.
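As a rough illustration of the four definitions above, the counts can be tallied from paired label lists. This is a minimal sketch: the helper name `confusion_counts`, the `positive=1` encoding, and the sample labels are assumptions made up for the example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Tally TP, FP, TN, FN for binary labels (hypothetical helper)."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative (false alarm)
        else:
            if actual == positive:
                fn += 1  # predicted negative, actually positive (miss)
            else:
                tn += 1  # predicted negative, actually negative
    return tp, fp, tn, fn


# Made-up labels for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(tp, fp, tn, fn)  # 3 1 3 1
```

The four counts always sum to the number of samples, which is a quick sanity check on any confusion matrix.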

Question 2

What is the difference between precision and recall?

Accepted Answer

Precision is the fraction of the model's positive predictions that are actually correct: TP / (TP + FP). It measures how trustworthy a positive prediction is. Recall (also called sensitivity or true positive rate) is the fraction of actual positives the model correctly identified: TP / (TP + FN). It measures how thorough the model is. A spam filter with high precision rarely marks legitimate emails as spam. A cancer screening tool with high recall rarely misses a true case. The two usually trade off: lowering the decision threshold flags more cases, which raises recall but tends to let in more false positives and lower precision, and raising the threshold does the reverse.
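The trade-off can be seen by sweeping a decision threshold over model scores. The sketch below assumes a model that outputs a score per sample and predicts "positive" when the score is at or above the threshold; the function name `precision_recall_at` and the scores are invented for the example.

```python
def precision_recall_at(scores, labels, threshold):
    """Precision and recall when 'score >= threshold' means positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    # Return None when a ratio is undefined (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else None
    recall = tp / (tp + fn) if (tp + fn) else None
    return precision, recall


# Made-up scores and true labels, for illustration only.
scores = [0.95, 0.85, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

for threshold in (0.8, 0.5, 0.25):
    p, r = precision_recall_at(scores, labels, threshold)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")
# threshold=0.8: precision=1.00, recall=0.50
# threshold=0.5: precision=0.75, recall=0.75
# threshold=0.25: precision=0.67, recall=1.00
```

On this toy data, each step down in threshold gains recall and gives up precision.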

Question 3

What is MCC and why is it better than accuracy for imbalanced data?

Accepted Answer

The Matthews Correlation Coefficient (MCC) is calculated as (TP * TN - FP * FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). It produces a value between -1 and +1 that takes all four cells of the confusion matrix into account. A score of +1 means perfect predictions, 0 means the model is no better than random guessing, and -1 means total disagreement: every prediction is wrong. If any of the four sums in the denominator is zero (for example, the model never predicts "positive"), the formula is undefined. Because MCC is only high when the model does well on both classes, it is much harder to inflate through class imbalance than accuracy is, which makes it a widely recommended single-number summary when your dataset is imbalanced.
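The formula translates directly into code. This is a minimal sketch: the function name `mcc` is an assumption, and returning `None` for a zero denominator is one possible choice (some tools report 0 in that case instead).

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews Correlation Coefficient from the four confusion matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return None  # undefined: at least one row or column of the matrix is empty
    return (tp * tn - fp * fn) / denominator


print(mcc(50, 0, 50, 0))   # 1.0, every prediction correct
print(mcc(0, 50, 0, 50))   # -1.0, every prediction wrong
print(mcc(0, 0, 95, 5))    # None, the model never predicts positive

# Imbalanced, made-up case: accuracy is (1 + 94) / 100 = 0.95,
# yet MCC stays low because only 1 of 5 positives is found.
print(round(mcc(1, 1, 94, 4), 3))
```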

Question 4

What is specificity and how is it different from precision?

Accepted Answer

Specificity (True Negative Rate) is TN / (TN + FP) - the fraction of actual negatives the model correctly identifies. Precision is TP / (TP + FP) - the fraction of predicted positives that are actually positive. Both deal with false positives, but from different perspectives: specificity looks at what fraction of real negatives were safely rejected, while precision looks at what fraction of the model's positive flags can be trusted. In medical testing, specificity tells you how reliably the test returns a negative result for people who do not have the disease; precision tells you how likely it is that a positive test result really means the disease is present (this depends heavily on prevalence).
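The dependence of precision on prevalence follows from Bayes' theorem, and a short calculation makes it concrete. The sketch below assumes a test with fixed sensitivity and specificity; the function name `precision_from_rates` and the 95% figures are invented for the example.

```python
def precision_from_rates(sensitivity, specificity, prevalence):
    """Precision (PPV) implied by sensitivity, specificity, and prevalence."""
    true_positive_share = sensitivity * prevalence
    false_positive_share = (1 - specificity) * (1 - prevalence)
    return true_positive_share / (true_positive_share + false_positive_share)


# Same hypothetical test (95% sensitivity, 95% specificity), different populations.
for prevalence in (0.01, 0.10, 0.50):
    ppv = precision_from_rates(0.95, 0.95, prevalence)
    print(f"prevalence={prevalence:.2f}: precision={ppv:.3f}")
# prevalence=0.01: precision=0.161
# prevalence=0.10: precision=0.679
# prevalence=0.50: precision=0.950
```

Specificity is 0.95 in all three rows, yet at 1% prevalence only about one positive result in six is a real case.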

Question 5

How do I choose between F1 and F2 score?

Accepted Answer

Both are weighted harmonic means of precision and recall, special cases of F-beta = (1 + beta^2) * P * R / (beta^2 * P + R). F1 (beta = 1) weights them equally. F2 (beta = 2) treats recall as twice as important as precision, making it the right choice when missing a positive (a false negative) is costlier than a false alarm. Use F1 when precision and recall errors are equally costly; use F2 in scenarios like fraud detection or medical diagnosis where failing to catch a true positive causes more harm than an occasional false alarm.
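To see how the choice plays out, compare two hypothetical models whose precision and recall are swapped. This sketch uses the general F-beta formula, (1 + beta^2) * P * R / (beta^2 * P + R), which reduces to the F1 and F2 formulas for beta = 1 and beta = 2; the function name `f_beta` and the numbers are assumptions for the example.

```python
def f_beta(precision, recall, beta=1.0):
    """F-beta score; beta > 1 favors recall, beta < 1 favors precision."""
    denominator = beta**2 * precision + recall
    if denominator == 0:
        return 0.0  # both precision and recall are zero
    return (1 + beta**2) * precision * recall / denominator


# Model A: cautious (high precision, low recall). Model B: the reverse.
for name, p, r in (("A", 0.9, 0.5), ("B", 0.5, 0.9)):
    print(f"model {name}: F1={f_beta(p, r, 1):.3f}, F2={f_beta(p, r, 2):.3f}")
# model A: F1=0.643, F2=0.549
# model B: F1=0.643, F2=0.776
```

F1 cannot tell the two models apart, while F2 clearly prefers the one that misses fewer positives.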

Question 6

What is balanced accuracy and when should I use it?

Accepted Answer

Balanced accuracy is the average of sensitivity (recall) and specificity: (TPR + TNR) / 2. Unlike raw accuracy, it is not inflated by class imbalance because it gives equal weight to both classes regardless of how many samples are in each. Use it when your dataset is imbalanced and you want a simple percentage metric that remains interpretable, as an alternative to MCC, which is harder to explain to a non-technical audience.
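A small calculation shows the gap between the two metrics on skewed data. This is a sketch with made-up counts; the function name `balanced_accuracy` is an assumption.

```python
def balanced_accuracy(tp, fp, tn, fn):
    """Average of recall (TPR) and specificity (TNR)."""
    tpr = tp / (tp + fn)  # share of actual positives that were caught
    tnr = tn / (tn + fp)  # share of actual negatives that were cleared
    return (tpr + tnr) / 2


# Made-up counts: 50 positives and 950 negatives.
tp, fn, tn, fp = 30, 20, 900, 50

accuracy = (tp + tn) / (tp + fp + tn + fn)
print(f"accuracy={accuracy:.3f}")                                     # accuracy=0.930
print(f"balanced accuracy={balanced_accuracy(tp, fp, tn, fn):.3f}")   # balanced accuracy=0.774
```

Raw accuracy is dominated by the 950 negatives; balanced accuracy exposes the 60% recall on the rare class.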

Question 7

Can accuracy be high while the model is still bad?

Accepted Answer

Yes. If 95% of your samples are negative, a classifier that always predicts "negative" achieves 95% accuracy while having 0% recall - it never detects a positive case. This is why you should always check MCC, F1, recall, and balanced accuracy alongside accuracy, especially when the positive class is rare. This calculator computes all of these automatically so you can spot the problem at a glance.
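The always-negative scenario is easy to reproduce. The sketch below builds a made-up dataset of 1,000 samples with 5% positives and scores a classifier that predicts "negative" for everything; all variable names are assumptions for the example.

```python
# 50 positives (1) and 950 negatives (0); the "model" predicts 0 every time.
y_true = [1] * 50 + [0] * 950
y_pred = [0] * len(y_true)

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

accuracy = (tp + tn) / len(y_true)
recall = tp / (tp + fn)
specificity = tn / (tn + fp)
balanced_accuracy = (recall + specificity) / 2
# Count form of F1, equivalent to 2PR / (P+R) but defined even with no predicted positives.
f1 = 2 * tp / (2 * tp + fp + fn)

print(f"accuracy={accuracy:.2f}")                     # accuracy=0.95
print(f"recall={recall:.2f}")                         # recall=0.00
print(f"balanced accuracy={balanced_accuracy:.2f}")   # balanced accuracy=0.50
print(f"F1={f1:.2f}")                                 # F1=0.00
```

Precision and MCC are undefined here because the model makes no positive predictions, which is itself a warning sign.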

Metric	Formula	Ideal value	Range
Accuracy	(TP+TN) / total	1.0 (100%)	0 to 1
Precision (PPV)	TP / (TP+FP)	1.0 (100%)	0 to 1
Recall (Sensitivity)	TP / (TP+FN)	1.0 (100%)	0 to 1
Specificity (TNR)	TN / (TN+FP)	1.0 (100%)	0 to 1
F1 Score	2PR / (P+R)	1.0 (100%)	0 to 1
F2 Score	5PR / (4P+R)	1.0 (100%)	0 to 1
MCC	(TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))	+1.0	-1 to +1
Balanced Accuracy	(Recall + Specificity) / 2	1.0 (100%)	0 to 1
NPV	TN / (TN+FN)	1.0 (100%)	0 to 1
FPR (Fall-out)	FP / (FP+TN)	0 (0%)	0 to 1
FNR (Miss rate)	FN / (FN+TP)	0 (0%)	0 to 1
FDR	FP / (FP+TP)	0 (0%)	0 to 1
Prevalence	(TP+FN) / total	varies	0 to 1
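The last five rows of the table can be computed the same way as the headline metrics. This is a minimal sketch with made-up counts; the function name `error_rates` and the dictionary keys are assumptions for the example.

```python
def error_rates(tp, fp, tn, fn):
    """NPV, FPR, FNR, FDR, and prevalence from the four counts."""
    total = tp + fp + tn + fn
    return {
        "npv": tn / (tn + fn),         # negative predictions that are correct
        "fpr": fp / (fp + tn),         # 1 - specificity
        "fnr": fn / (fn + tp),         # 1 - recall
        "fdr": fp / (fp + tp),         # 1 - precision
        "prevalence": (tp + fn) / total,
    }


# Made-up counts: 100 actual positives out of 1,000 samples.
rates = error_rates(tp=80, fp=20, tn=880, fn=20)
for name, value in rates.items():
    print(f"{name}={value:.3f}")
# npv=0.978
# fpr=0.022
# fnr=0.200
# fdr=0.200
# prevalence=0.100
```

FPR, FNR, and FDR are the complements of specificity, recall, and precision, so they carry the same information from the error side.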

Confusion Matrix Calculator
