prop.test confidence intervals (PR#2794) - R-devel

Fri, Apr 18, 2003 12:01 PM

Full_Name: Robert W. Baer, Ph.D.
Version: 1.6.2
OS: Windows 2000
Submission from: (NULL) (198.209.172.106)


Problem:  prop.test() does not seem to produce appropriate confidence intervals
for the case where the vector length of x and n is one.  (I am not certain about
higher vector lengths.)

As an example, I include x=6 and n=52, which gives a sample proportion of 0.115.
When I calculate the 95% CI by hand using the normal approximation (with no
continuity correction), I get (0.028, 0.202).  The exact binomial CI from
binom.test() is (0.044, 0.234).  With correct=FALSE, prop.test() produces CI95 =
(0.05396969, 0.22971664), which is neither of these.  With correct=TRUE, it
produces (0.04778925, 0.24129372).  This looks reasonably like a
normal-approximation 95% CI for the true binomial (which I presume is what
prop.test() uses), but I did not actually check it by hand.
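
For reference, the hand calculation described above can be sketched in R as
follows. This is a minimal sketch, assuming n = 52 (the number of trials shown
in the binom.test() output) and the usual Wald formula p +- z*se(p); the
variable names are illustrative only.

```r
# Wald (textbook normal-approximation) 95% CI, no continuity correction.
x <- 6
n <- 52                                   # assumed from the binom.test() output
p_hat <- x / n                            # sample proportion, about 0.115
se    <- sqrt(p_hat * (1 - p_hat) / n)    # estimated standard error of p_hat
z     <- qnorm(0.975)                     # about 1.96 for a 95% interval
wald  <- p_hat + c(-1, 1) * z * se
wald
```

The result is roughly (0.029, 0.202); the slightly smaller lower limit quoted
above is consistent with rounding the proportion to 0.115 before subtracting.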

BUG summary:  prop.test() appears to calculate the 95% CI of a sample
proportion incorrectly when the continuity correction is turned off.

-----------------------------
Sample R code and output shown below:
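
The calls themselves are not included in the post. Judging from the "data:"
lines in the output, they were presumably something like the following; the
object names with_cc, without_cc and exact are hypothetical.

```r
# Presumed reconstruction of the calls behind the output below.
x <- 6
n <- 52
with_cc    <- prop.test(x, n, correct = TRUE)    # with continuity correction
without_cc <- prop.test(x, n, correct = FALSE)   # without continuity correction
exact      <- binom.test(x, n)                   # exact (Clopper-Pearson) interval
print(with_cc)
print(without_cc)
print(exact)
```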

1-sample proportions test with continuity correction

data:  x out of n, null probability 0.5 
X-squared = 29.25, df = 1, p-value = 6.362e-08
alternative hypothesis: true p is not equal to 0.5 
95 percent confidence interval:
 0.04778925 0.24129372 
sample estimates:
        p 
0.1153846

1-sample proportions test without continuity correction

data:  x out of n, null probability 0.5 
X-squared = 30.7692, df = 1, p-value = 2.906e-08
alternative hypothesis: true p is not equal to 0.5 
95 percent confidence interval:
 0.05396969 0.22971664 
sample estimates:
        p 
0.1153846

Exact binomial test

data:  x and n 
number of successes = 6, number of trials = 52, p-value = 1.033e-08
alternative hypothesis: true probability of success is not equal to 0.5 
95 percent confidence interval:
 0.0435412 0.2344083 
sample estimates:
probability of success 
             0.1153846

Thomas Lumley

Fri, Apr 18, 2003 1:54 PM

On Fri, 18 Apr 2003 rbaer@kcom.edu wrote:

No, it isn't a bug.  It uses a Normal approximation, but not the one you
were using. It is based on inverting the score test rather than the Wald
test, and is substantially more accurate.
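
The score-test interval (often called the Wilson interval) has a closed form.
The sketch below computes it by hand for x = 6, n = 52 and compares it with
what prop.test(correct = FALSE) returns; the variable names are illustrative
only.

```r
# Interval obtained by inverting the score test, without continuity correction.
x <- 6
n <- 52
p_hat <- x / n
z     <- qnorm(0.975)

denom  <- 1 + z^2 / n
centre <- (p_hat + z^2 / (2 * n)) / denom
half   <- z * sqrt(p_hat * (1 - p_hat) / n + z^2 / (4 * n^2)) / denom
wilson <- c(centre - half, centre + half)

# Side-by-side comparison with prop.test()
rbind(by_hand   = wilson,
      prop.test = as.numeric(prop.test(x, n, correct = FALSE)$conf.int))
```

Unlike the Wald interval, this one is not centred on the sample proportion: the
centre is pulled towards 0.5, which is why the limits differ from p +- z*se(p).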

Checking by simulation: n=42, p=0.1, 10,000 replicates, the coverage from
prop.test without correction was 96.7%, with correction was 97.9% and from
the Wald-based Normal approximation was 92.3%.
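
A coverage check along these lines could be sketched as below. This is not the
code used for the figures above: the seed is arbitrary and the estimates will
vary slightly from run to run. Because the interval depends only on the
observed count, each interval is computed once per possible count and then
looked up for every simulated draw.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
n <- 42; p <- 0.1; reps <- 10000
z <- qnorm(0.975)

# One interval per possible count 0..n (row i corresponds to count i - 1).
xs <- 0:n
score    <- t(sapply(xs, function(x) prop.test(x, n, correct = FALSE)$conf.int))
score_cc <- t(sapply(xs, function(x) prop.test(x, n, correct = TRUE)$conf.int))
p_hat <- xs / n
se    <- sqrt(p_hat * (1 - p_hat) / n)
wald  <- cbind(p_hat - z * se, p_hat + z * se)

# TRUE for each count whose interval contains the true p.
covers <- function(ci) ci[, 1] <= p & p <= ci[, 2]

draws <- rbinom(reps, n, p) + 1  # +1 turns a count into a row index
coverage <- c(
  score     = mean(covers(score)[draws]),
  corrected = mean(covers(score_cc)[draws]),
  wald      = mean(covers(wald)[draws])
)
round(100 * coverage, 1)         # estimated coverage in percent
```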


	-thomas

Peter Dalgaard

Fri, Apr 18, 2003 4:11 PM

rbaer@kcom.edu writes:

...n=52...

Uhm... Basically, we know the correct answer from binom.test, and R's
intervals are considerably closer to that than the textbook p+-2*se(p)
formula. So R has a bug because it isn't inaccurate enough??

This might enlighten you:

prop.test(6,52,p=.05396969,alt="g",correct=F)
prop.test(6,52,p=.22971664,alt="l",correct=F)

also, consider the case x=0.
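
Both hints can be spelled out in a short sketch. It recomputes the limits rather
than typing them in, and the variable names are illustrative only. The first
part shows that testing either limit as the null value gives a one-sided
p-value of 0.025, which is what inverting the score test means; the second
shows that the textbook formula collapses to a single point when x = 0.

```r
x <- 6
n <- 52
ci <- prop.test(x, n, correct = FALSE)$conf.int

# One-sided score tests with each limit as the null value.
# R may warn about the chi-squared approximation for the lower limit,
# because the expected count n * p is below 5 there.
p_lower <- prop.test(x, n, p = ci[1], alternative = "greater",
                     correct = FALSE)$p.value
p_upper <- prop.test(x, n, p = ci[2], alternative = "less",
                     correct = FALSE)$p.value
c(p_lower, p_upper)              # both 0.025 up to floating-point error

# The case x = 0: se(p) is zero, so the Wald interval is the point (0, 0),
# while the score interval still has a positive upper limit.
z      <- qnorm(0.975)
p0     <- 0 / n
wald0  <- p0 + c(-1, 1) * z * sqrt(p0 * (1 - p0) / n)
score0 <- as.numeric(prop.test(0, n, correct = FALSE)$conf.int)
rbind(wald = wald0, score = score0)
```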

O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)             FAX: (+45) 35327907