Sensitivity, specificity, and predictive values

2 messages · Michael Hills, dcm2104 at columbia.edu

All of the examples cited in this discussion assume that a single sample
of subjects is taken from a population and then classified as disease
positive or negative, using the reference test. When this is the case
the true prevalence can also be obtained from the sample, but in many
situations separate samples are taken to estimate sensitivity and
specificity, so that the proportion of subjects who are disease positive
depends on the sample sizes chosen, and no estimate of prevalence is
possible. 

In this case the sensitivity and specificity can be estimated as before
and then applied to a population in which the true prevalence of the
disease is p to give the predictive odds of a positive test in that
population, namely

p/(1-p) x Sens/(1-Spec) = p/(1-p) x LR

so the CI for the predictive odds of a positive test is directly related
to the CI for the LR.
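
To make the arithmetic above concrete, here is a minimal R sketch with
made-up values (the sensitivity, specificity, and prevalence below are
illustrative, not from this thread):

```r
# Predictive odds of disease given a positive test, from the pre-test
# odds p/(1-p) and the positive likelihood ratio Sens/(1-Spec).
# All numbers are illustrative assumptions.
sens <- 0.90          # sensitivity
spec <- 0.80          # specificity
p    <- 0.10          # true prevalence in the target population

lr_pos    <- sens / (1 - spec)        # positive likelihood ratio
pre_odds  <- p / (1 - p)              # pre-test odds
post_odds <- pre_odds * lr_pos        # predictive odds of a positive test
ppv       <- post_odds / (1 + post_odds)  # convert odds back to PPV
```

With these numbers LR = 4.5, so a 10% prevalence gives predictive odds
of 0.5, i.e. a PPV of one third.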

The epicentre package does provide an interval for the LR, but it seems
likely that this is based on a single sample, not two separate samples.
For two separate samples a method for finding the CI for the ratio of
two independent proportions (Sens and 1-Spec) is required. Any
suggestions for doing this in R?
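
One standard large-sample option (a sketch under assumed counts, not
part of the original message) is the Katz log method used for relative
risks: put a normal interval on log(LR) with the delta-method standard
error, then exponentiate:

```r
# Approximate 95% CI for LR+ = Sens / (1 - Spec) from two independent
# samples, via a normal interval on the log scale (Katz log method).
# Counts are illustrative assumptions.
x1 <- 90;  n1 <- 100   # diseased sample: test positives / total -> Sens
x2 <- 20;  n2 <- 100   # non-diseased sample: test positives / total -> 1 - Spec

p1 <- x1 / n1          # estimated sensitivity
p2 <- x2 / n2          # estimated 1 - specificity
lr <- p1 / p2          # estimated positive likelihood ratio

# Var(log p-hat) is approximately (1 - p) / (n p) for each sample
se_log_lr <- sqrt((1 - p1) / (n1 * p1) + (1 - p2) / (n2 * p2))
ci <- exp(log(lr) + c(-1, 1) * qnorm(0.975) * se_log_lr)
```

The resulting interval for the LR can then be multiplied through by the
pre-test odds p/(1-p) to give an interval for the predictive odds.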

Michael Hills
Hi,

A good way to circumvent many of the aforementioned limitations is to
resort to non-parametric ordinary bootstrapping, whereby you re-sample
your dataset B times (B is typically greater than 5000, but rarely
smaller than 1000 unless your original dataset is very small or
computational time is too expensive). You can then calculate the
sensitivity, specificity, PPV, and NPV for each re-sampled dataset.
Finally, you estimate the mean and confidence interval from the
bootstrap-generated sensitivity, specificity, PPV, and NPV
distributions.
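
The procedure above can be sketched with the boot package; the data
here are simulated purely for illustration:

```r
library(boot)

# Simulated single sample classified by a reference test
# (disease status) and an index test -- illustrative data only.
set.seed(1)
dat <- data.frame(disease = rbinom(200, 1, 0.3))
dat$test <- rbinom(200, 1, ifelse(dat$disease == 1, 0.9, 0.2))

# Statistic function: recompute all four measures on each resample
stats <- function(d, idx) {
  d  <- d[idx, ]
  tp <- sum(d$disease == 1 & d$test == 1)
  fn <- sum(d$disease == 1 & d$test == 0)
  fp <- sum(d$disease == 0 & d$test == 1)
  tn <- sum(d$disease == 0 & d$test == 0)
  c(sens = tp / (tp + fn), spec = tn / (tn + fp),
    ppv  = tp / (tp + fp), npv  = tn / (tn + fn))
}

b <- boot(dat, stats, R = 2000)        # 2000 bootstrap replicates
boot.ci(b, type = "perc", index = 1)   # percentile CI for sensitivity
```

Changing `index` gives the corresponding interval for specificity, PPV,
or NPV.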

If applicable, you can use these distributions to compare two or more
diagnostic tests. For example, you can sample the sensitivity
distributions of two diagnostic tests (via, e.g., bootstrap again or
permutation), compute their differences, and then test (t-test)
whether the final distribution has zero mean.
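
As a minimal sketch of that comparison (simulated results for two
hypothetical tests on the same diseased subjects; a percentile interval
on the bootstrapped difference is used here in place of the t-test):

```r
# Bootstrap the difference in sensitivity between two tests applied
# to the same diseased subjects.  Data are simulated assumptions.
set.seed(2)
n <- 150
testA <- rbinom(n, 1, 0.85)   # test A results among diseased subjects
testB <- rbinom(n, 1, 0.75)   # test B results among the same subjects

# Resample subjects (paired) and recompute the sensitivity difference
diff_sens <- replicate(5000, {
  idx <- sample.int(n, replace = TRUE)
  mean(testA[idx]) - mean(testB[idx])
})

quantile(diff_sens, c(0.025, 0.975))  # does the interval exclude zero?
```

If the 95% interval excludes zero, the two sensitivities differ at the
5% level.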

The same procedure applies to other estimates (e.g., specificity, PPV,
etc.), and other tests along the same lines may be constructed. You can
load library(boot) and type "?boot" at the R prompt for further
information.

If neither test is a "gold standard," the estimation of  
prevalence-dependent PPV and NPV is considerably more complicated.

Daniel