Specificity (TNR)

Specificity measures: of all the actual negatives, what fraction did the model correctly identify as negative?

$specificity = \frac{TN}{TN + FP}$

The denominator is all the actual negatives: those the model correctly classified as negative (TN) plus those it incorrectly flagged as positive (FP).

Specificity is also called the true negative rate (TNR). It is the mirror image of recall (sensitivity, TPR):

Recall (TPR) — fraction of actual positives caught.
Specificity (TNR) — fraction of actual negatives correctly identified.
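As a quick illustration of the two definitions, here is a minimal sketch that computes both rates from raw confusion-matrix counts. The function name `recall_and_specificity` and the counts are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def recall_and_specificity(tp, fn, tn, fp):
    """Return (recall, specificity) from confusion-matrix counts."""
    recall = tp / (tp + fn)        # fraction of actual positives caught
    specificity = tn / (tn + fp)   # fraction of actual negatives correctly identified
    return recall, specificity


# Hypothetical counts: 100 actual positives, 200 actual negatives
recall, specificity = recall_and_specificity(tp=80, fn=20, tn=150, fp=50)
print(f"recall={recall:.2f}, specificity={specificity:.2f}")
```

Note that the two rates use disjoint parts of the confusion matrix: recall only looks at the actual positives, specificity only at the actual negatives.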

High specificity means the model rarely raises false alarms. The applications that care most about specificity are those where a false alarm is costly:

Spam filtering, where a legitimate email going to the spam folder (FP) is more harmful than a few spam messages getting through.
Drug testing, where wrongly flagging an innocent person is worse than missing some actual drug users.
Quality control, where wrongly rejecting good product is more expensive than missing the occasional defect.

The complement of specificity is the false positive rate (FPR):

$FPR = 1 - specificity = \frac{FP}{FP + TN}$

FPR is what the ROC curve plots on the x-axis (against TPR on the y-axis), so specificity and FPR carry the same information: each is one minus the other.
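One way to see the relationship is to take the FPR values that scikit-learn's `roc_curve` returns and subtract them from one. This is a sketch only; the labels and scores below are made up for illustration.

```python
import numpy as np
from sklearn.metrics import roc_curve

# Hypothetical labels and classifier scores, for illustration only
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_score = np.array([0.1, 0.3, 0.4, 0.8, 0.35, 0.6, 0.7, 0.9])

# roc_curve returns the x-axis (FPR) and y-axis (TPR) of the ROC curve
fpr, tpr, thresholds = roc_curve(y_true, y_score)

# Specificity at each threshold is the complement of FPR
specificity = 1 - fpr

for thr, f, s, t in zip(thresholds, fpr, specificity, tpr):
    print(f"threshold={thr:.2f}  FPR={f:.2f}  specificity={s:.2f}  TPR={t:.2f}")
```

Reading the output from top to bottom, FPR rises from 0 to 1 while specificity falls from 1 to 0.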

The recall-specificity tradeoff

Tuning a classifier’s threshold trades recall against specificity:

Lower threshold (more eager to predict positive) → higher recall, lower specificity.
Higher threshold (more conservative) → lower recall, higher specificity.

The ROC curve traces out this tradeoff across all thresholds. The right operating point depends on the costs in the application.
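The tradeoff can be sketched by sweeping a threshold over a small set of scores. The helper `recall_specificity_at`, the data, and the three thresholds are all hypothetical; a real classifier would supply its own scores.

```python
def recall_specificity_at(threshold, y_true, y_score):
    """Predict positive when score >= threshold, then return (recall, specificity)."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels and scores, for illustration only
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_score = [0.1, 0.3, 0.4, 0.8, 0.35, 0.6, 0.7, 0.9]

results = {}
for threshold in (0.2, 0.5, 0.85):
    results[threshold] = recall_specificity_at(threshold, y_true, y_score)
    recall, specificity = results[threshold]
    print(f"threshold={threshold}: recall={recall:.2f}, specificity={specificity:.2f}")
```

On this toy data, raising the threshold from 0.2 to 0.85 moves recall from 1.00 down to 0.25 while specificity climbs from 0.25 to 1.00.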

In scikit-learn

scikit-learn doesn’t have a standalone specificity_score, but it’s easy to compute from the confusion matrix:

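from sklearn.metrics import confusion_matrix

# For binary 0/1 labels, ravel() flattens the 2x2 matrix in the order tn, fp, fn, tp
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
specificity = tn / (tn + fp)  # true negatives over all actual negatives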

Equivalently, specificity is the recall of the negative class: pass pos_label=0 to recall_score. The wine-quality classifier from the Introduction to Data Science textbook ends up with recall 0.84 and specificity 0.55, a real asymmetry of the kind that accuracy alone wouldn’t reveal.
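The two routes to specificity can be checked against each other on a small example. This is a sketch with made-up labels, assuming the negative class is encoded as 0. These are not the wine-quality data.

```python
from sklearn.metrics import confusion_matrix, recall_score

# Hypothetical binary labels and predictions, for illustration only
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 0, 1, 1, 1]

# Route 1: from the confusion matrix (labels=[0, 1] fixes the row/column order)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
spec_from_matrix = tn / (tn + fp)

# Route 2: recall of the negative class
spec_from_recall = recall_score(y_true, y_pred, pos_label=0)

print(spec_from_matrix, spec_from_recall)
```

Both values agree because treating class 0 as the "positive" label turns true negatives into the hits that recall counts.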

Idriss Rami — Notes
