Calculate sensitivity, specificity and predictive values
These functions calculate the sensitivity, specificity or predictive values of a measurement system compared to a reference result (the truth or a gold standard). The measurement and "truth" data must have the same two possible outcomes, and one of the outcomes must be thought of as a "positive" result.
negPredValue(data, ...)

## Default S3 method:
negPredValue(
  data,
  reference,
  negative = levels(reference)[2],
  prevalence = NULL,
  ...
)

## S3 method for class 'table'
negPredValue(data, negative = rownames(data)[-1], prevalence = NULL, ...)

## S3 method for class 'matrix'
negPredValue(data, negative = rownames(data)[-1], prevalence = NULL, ...)

posPredValue(data, ...)

## Default S3 method:
posPredValue(
  data,
  reference,
  positive = levels(reference)[1],
  prevalence = NULL,
  ...
)

## S3 method for class 'table'
posPredValue(data, positive = rownames(data)[1], prevalence = NULL, ...)

## S3 method for class 'matrix'
posPredValue(data, positive = rownames(data)[1], prevalence = NULL, ...)

sensitivity(data, ...)

## Default S3 method:
sensitivity(
  data,
  reference,
  positive = levels(reference)[1],
  na.rm = TRUE,
  ...
)

## S3 method for class 'table'
sensitivity(data, positive = rownames(data)[1], ...)

## S3 method for class 'matrix'
sensitivity(data, positive = rownames(data)[1], ...)
data | for the default functions, a factor containing the discrete measurements. For the |
... | not currently used |
reference | a factor containing the reference values |
negative | a character string that defines the factor level corresponding to the "negative" results |
prevalence | a numeric value for the rate of the "positive" class of the data |
positive | a character string that defines the factor level corresponding to the "positive" results |
na.rm | a logical value indicating whether |
The sensitivity is defined as the proportion of positive results out of the number of samples which were actually positive. When there are no positive results, sensitivity is not defined and a value of NA is returned. Similarly, when there are no negative results, specificity is not defined and a value of NA is returned. Similar statements are true for predictive values.
The positive predictive value is defined as the proportion of predicted positives that are actually positive, while the negative predictive value is defined as the proportion of predicted negatives that are actually negative.
Suppose a 2x2 table with notation

          | Reference |          |
Predicted | Event     | No Event |
Event     | A         | B        |
No Event  | C         | D        |
The formulas used here are:
Sensitivity = A/(A+C)
Specificity = D/(B+D)
Prevalence = (A+C)/(A+B+C+D)
PPV = (sensitivity * Prevalence)/((sensitivity*Prevalence) + ((1-specificity)*(1-Prevalence)))
NPV = (specificity * (1-Prevalence))/(((1-sensitivity)*Prevalence) + ((specificity)*(1-Prevalence)))
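The formulas above can be checked by hand in base R. The following is a minimal sketch, not the package's implementation: the cell counts A, B, C and D are those of the two-class example on this page (with "abnormal" as the event), and the variable names `sens`, `spec`, `prev`, `ppv`, `npv`, `prev_alt`, `ppv_alt` and `npv_alt` are illustrative only.

```r
# Cell counts in the A/B/C/D layout (rows = predicted, columns = reference),
# taken from the two-class example data with "abnormal" as the event.
A <- 231  # predicted event,    reference event
B <- 32   # predicted event,    reference no event
C <- 27   # predicted no event, reference event
D <- 54   # predicted no event, reference no event

sens <- A / (A + C)
spec <- D / (B + D)
prev <- (A + C) / (A + B + C + D)

ppv <- (sens * prev) / ((sens * prev) + ((1 - spec) * (1 - prev)))
npv <- (spec * (1 - prev)) / (((1 - sens) * prev) + (spec * (1 - prev)))

# With the prevalence taken from the table itself, the formulas reduce to
# the raw proportions A/(A+B) and D/(C+D).
stopifnot(isTRUE(all.equal(ppv, A / (A + B))),
          isTRUE(all.equal(npv, D / (C + D))))

# An assumed lower prevalence (0.25, chosen for illustration) shifts both values.
prev_alt <- 0.25
ppv_alt <- (sens * prev_alt) /
  ((sens * prev_alt) + ((1 - spec) * (1 - prev_alt)))
npv_alt <- (spec * (1 - prev_alt)) /
  (((1 - sens) * prev_alt) + (spec * (1 - prev_alt)))
```

Sensitivity and specificity do not depend on the prevalence, so only the predictive values change when a different prior is supplied: here the lower prevalence decreases the PPV and increases the NPV.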
See the references for discussions of the statistics.
A number between 0 and 1 (or NA).
Max Kuhn
Kuhn, M. (2008), “Building predictive models in R using the caret package,” Journal of Statistical Software (http://www.jstatsoft.org/article/view/v028i05/v28i05.pdf).
Altman, D.G., Bland, J.M. (1994) “Diagnostic tests 1: sensitivity and specificity,” British Medical Journal, vol 308, 1552.
Altman, D.G., Bland, J.M. (1994) “Diagnostic tests 2: predictive values,” British Medical Journal, vol 309, 102.
## Not run: 
###################
## 2 class example

lvs <- c("normal", "abnormal")
truth <- factor(rep(lvs, times = c(86, 258)),
                levels = rev(lvs))
pred <- factor(
  c(
    rep(lvs, times = c(54, 32)),
    rep(lvs, times = c(27, 231))),
  levels = rev(lvs))

xtab <- table(pred, truth)

sensitivity(pred, truth)
sensitivity(xtab)
posPredValue(pred, truth)
posPredValue(pred, truth, prevalence = 0.25)

specificity(pred, truth)
negPredValue(pred, truth)
negPredValue(xtab)
negPredValue(pred, truth, prevalence = 0.25)

prev <- seq(0.001, .99, length = 20)
npvVals <- ppvVals <- prev * NA
for(i in seq(along = prev)) {
  ppvVals[i] <- posPredValue(pred, truth, prevalence = prev[i])
  npvVals[i] <- negPredValue(pred, truth, prevalence = prev[i])
}

plot(prev, ppvVals,
     ylim = c(0, 1),
     type = "l",
     ylab = "",
     xlab = "Prevalence (i.e. prior)")
points(prev, npvVals, type = "l", col = "red")
abline(h = sensitivity(pred, truth), lty = 2)
abline(h = specificity(pred, truth), lty = 2, col = "red")
legend(.5, .5,
       c("ppv", "npv", "sens", "spec"),
       col = c("black", "red", "black", "red"),
       lty = c(1, 1, 2, 2))

###################
## 3 class example

library(MASS)

fit <- lda(Species ~ ., data = iris)
model <- predict(fit)$class

irisTabs <- table(model, iris$Species)

## When passing factors, an error occurs with more
## than two levels
sensitivity(model, iris$Species)

## When passing a table, more than two levels can
## be used
sensitivity(irisTabs, "versicolor")
specificity(irisTabs, c("setosa", "virginica"))

## End(Not run)
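One way to read the three-class table calls above is as a "positive versus rest" comparison. The base R sketch below illustrates that reading under an explicit assumption: every class not named as positive is pooled into a single negative class. The helper `collapse_one_vs_rest` and the counts in `tab` are hypothetical and made up for illustration; they are not part of the package and not the result of the `lda` fit.

```r
# Hypothetical 3-class confusion table (rows = predicted, columns = reference).
# The counts are invented for illustration only.
classes <- c("setosa", "versicolor", "virginica")
tab <- matrix(c(50,  0,  0,
                 0, 48,  1,
                 0,  2, 49),
              nrow = 3, byrow = TRUE,
              dimnames = list(predicted = classes, reference = classes))

# Assumption: classes not listed in `positive` are pooled as "negative".
collapse_one_vs_rest <- function(tab, positive) {
  pos <- rownames(tab) %in% positive
  matrix(c(sum(tab[pos, pos]),  sum(tab[pos, !pos]),
           sum(tab[!pos, pos]), sum(tab[!pos, !pos])),
         nrow = 2, byrow = TRUE,
         dimnames = list(predicted = c("pos", "neg"),
                         reference = c("pos", "neg")))
}

two_by_two <- collapse_one_vs_rest(tab, "versicolor")

# Sensitivity = A / (A + C) and specificity = D / (B + D) on the collapsed table.
sens_versicolor <- two_by_two["pos", "pos"] / sum(two_by_two[, "pos"])
spec_versicolor <- two_by_two["neg", "neg"] / sum(two_by_two[, "neg"])
```

Collapsing first keeps the two-class formulas unchanged, at the cost of hiding which of the pooled classes an error came from.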