
prop.test in R

10 messages · Ralph O'Brien, PhD, Laura Chihara, Albyn Jones +3 more

#
Hi,

I have a question about prop.test in R:

I teach students the score confidence
interval for proportions (also called
Wilson or Wilson score interval).

prop.test(..., correct = FALSE) gives this
interval.

The default uses a continuity correction.
When should we use one over the other?
Is it worth going over this in class? Why
is correct=TRUE the default?
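
For concreteness, the two intervals can be compared side by side (the counts
below are made up):

```r
# Compare the score (Wilson) interval with and without continuity correction
# for 42 successes in 100 trials (hypothetical counts).
x <- 42; n <- 100
prop.test(x, n, correct = FALSE)$conf.int  # Wilson score interval
prop.test(x, n, correct = TRUE)$conf.int   # continuity-corrected, always wider
```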

Thanks for any pedagogical guidance here!

-- Laura

*******************************************
Laura Chihara
Professor of Mathematics   507-222-4065 (office)
Dept of Mathematics        507-222-4312 (fax)
Carleton College
1 North College Street
Northfield MN 55057
#
Yes, thank you for this reference. But according to
this article, the score interval is better than the
continuity-corrected one, so why is the continuity
correction the default in prop.test?

-Laura
On 10/25/2010 4:02 PM, Ralph O'Brien, PhD wrote:

#
Laura,

I would make the argument that the continuity correction should not be used in practice, or in the classroom. Continuity-corrected intervals are, on average, too wide. That might be defensible if they guaranteed their coverage level (as 'exact' distribution-based intervals do), but because they are asymptotic, their coverage can still fall below the nominal level.

The Agresti reference Ralph sent is an excellent article; I highly recommend it. I find it helpful to categorize discrete tests on two axes: conservative vs. approximate, and asymptotic vs. distribution based. Conservative tests try to keep the Type I error rate below the nominal level, and approximate tests try to keep it near the nominal level.

                 Asymptotic              Distribution based
  Conservative   Continuity corrected    Standard 'exact' test
  Approximate    Standard asymptotic     Mid p-value
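
As a sketch, all four cells can be computed in R for a single proportion
(hypothetical counts, testing H0: p = 0.5; the mid p-value below uses one
common two-sided doubling convention, which is an assumption on my part):

```r
x <- 42; n <- 100; p0 <- 0.5

binom.test(x, n, p0)$p.value                  # conservative, distribution based ('exact')
prop.test(x, n, p0, correct = TRUE)$p.value   # conservative, asymptotic
prop.test(x, n, p0, correct = FALSE)$p.value  # approximate, asymptotic

# Mid p-value (approximate, distribution based): count the observed
# outcome's probability only half, then double the smaller tail.
2 * (pbinom(x - 1, n, p0) + 0.5 * dbinom(x, n, p0))
```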



I would also be interested to hear why the default is correct=TRUE. Perhaps it is historical.

Ian
On Oct 25, 2010, at 1:38 PM, Laura Chihara wrote:

#
I don't know; the help file is uninformative.  I'd guess the answer is
"the author wrote it that way".  Other R functions such as t.test include
similar (to me) unfortunate default choices; in that case
var.equal=FALSE (i.e., the Welch test) is the default.

albyn
On Mon, Oct 25, 2010 at 04:15:20PM -0500, Laura Chihara wrote:

#
In the case of t.test, having the default be var.equal=FALSE is the right way to go. There is little to no power lost by using the Welch test, and the assumption of equal variances can be difficult to assess. For this reason, many introductory textbooks have now banished the equal-variance t-test from their chapters (e.g., Moore's The Basic Practice of Statistics).
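
A quick simulated illustration (made-up data): when the variances really are
equal, the two versions give almost identical answers, so defaulting to Welch
costs little.

```r
set.seed(1)
x <- rnorm(30, mean = 0.0, sd = 1)
y <- rnorm(30, mean = 0.5, sd = 1)

t.test(x, y)$p.value                    # Welch (R's default, var.equal = FALSE)
t.test(x, y, var.equal = TRUE)$p.value  # pooled equal-variance t-test
```

With equal sample sizes the two statistics coincide; only the degrees of
freedom, and hence the p-values, differ slightly.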

Ian
On Oct 25, 2010, at 4:05 PM, Albyn Jones wrote:
#
Exactly - elementary texts and methods books recommend the Welch test  
for the reason you mention.  Curiously, those same texts recommend  
using ANOVA and regression without automatically correcting for the  
possibility of non-constant variance.  Why is the case of comparing  
two means different from comparing three?  Those same books will tell  
you that ANOVA is pretty robust to non-constant variance.  Well, the  
two-sample t-test is ANOVA.
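
That identity is easy to verify: for two groups, the one-way ANOVA F
statistic is exactly the square of the pooled t statistic (simulated data).

```r
set.seed(2)
g <- factor(rep(c("A", "B"), each = 20))
y <- rnorm(40) + (g == "B")             # group B shifted up by 1

tt <- t.test(y ~ g, var.equal = TRUE)   # pooled two-sample t-test
av <- anova(lm(y ~ g))                  # one-way ANOVA

unname(tt$statistic)^2                  # squared t statistic
av[["F value"]][1]                      # ANOVA F statistic: the same number
```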

I don't use the Welch test except as a conscious decision, i.e., when I  
really want to compare the means while suspecting that the variances  
differ.  Generally people use the t-test to certify that two populations  
are different.  If the variances are wildly different, that may be much  
more important than a difference in means.  In fact, testing for a  
difference in means when the variances are wildly different is almost  
always substantively silly.  There was a great example a few years ago  
from a psychiatric journal, comparing two medications, where the  
investigators did a t-test for the means when one distribution was  
unimodal and the other was bimodal; there was no statistically  
significant difference in the means, but there was a really important  
difference in the distributions.  The automatic use of the Welch test  
makes you feel that you are protected against Bad Things, when you  
aren't.
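
A simulated sketch of that situation (made-up data, not the journal's):
two samples whose means essentially agree but whose shapes are radically
different, so a mean comparison finds nothing of interest.

```r
set.seed(6)
uni <- rnorm(100, mean = 0, sd = 1)                # unimodal
bi  <- c(rnorm(50, -2, 0.5), rnorm(50, 2, 0.5))    # bimodal, mean near 0

t.test(uni, bi)$p.value  # compares means only; blind to the shapes
```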

albyn

Quoting Ian Fellows <ian.fellows at stat.ucla.edu>:
#
On Tue, Oct 26, 2010 at 4:42 AM, Adams, Zeno <Zeno.Adams at ebs.edu> wrote:
Only if the order of the observations in each sample is fixed.  I
don't want to sound facetious, but the important characteristic of the
samples in a paired t-test is that they are paired.  The first
observation in sample 1 is associated in some way with the first
observation in sample 2, say because they are observations on the same
subject or at the same location or ...

If there is no pairing then one of the samples could be rearranged
without changing the other, thereby changing the covariance.

Because of the pairing the sample sizes in a paired t-test must be
equal.  But a t-test for independent samples can be used when the
sample sizes are unequal.  So, no, the t-test for independent samples
is not a special case of the paired t-test.
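
The distinction is easy to demonstrate with simulated paired data:
permuting one sample changes the paired test but leaves the
independent-samples test untouched.

```r
set.seed(3)
before <- rnorm(15, mean = 10, sd = 2)
after  <- before + rnorm(15, mean = 0.5, sd = 0.5)  # genuinely paired

t.test(after, before, paired = TRUE)$p.value         # uses the pairing
t.test(after, sample(before), paired = TRUE)$p.value # pairing destroyed
t.test(after, before)$p.value                        # ignores order entirely
```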
#
I agree with you that the presentation is unfortunate. Perhaps it has
something to do with the fact that heteroskedasticity-consistent covariance
matrices (HCCM) for linear regression are a relatively recent development
(by White and Huber, from the early 80s), and initially they performed poorly
for small sample sizes. From a pedagogy standpoint, the derivations of the
HCCM formulas are beyond the scope of an undergraduate course, whereas the
equal-variance versions can be easily derived.
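
As a sketch of how this looks in R (assuming the add-on packages sandwich
and lmtest are installed; they are not part of base R):

```r
library(sandwich)  # vcovHC: White/Huber-type covariance estimators
library(lmtest)    # coeftest: Wald tests with a supplied covariance matrix

set.seed(5)
x <- runif(100)
y <- 1 + 2 * x + rnorm(100, sd = 0.5 + x)  # error variance grows with x
fit <- lm(y ~ x)

coeftest(fit)                                    # classical equal-variance SEs
coeftest(fit, vcov = vcovHC(fit, type = "HC3"))  # heteroskedasticity-robust SEs
```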

Given more recent simulation studies showing that the power and level of
tests based on HCCM are comparable with those of equal-variance regression,
and that there is rarely any a priori reason to think the variances are
equal, there is a good case for teaching and using the robust versions.

ANOVA is robust to violations so long as the group sizes are equal; if
they aren't, then it isn't.
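
Base R does offer an ANOVA analogue of the Welch test, though it is rarely
taught (simulated unbalanced groups with unequal variances):

```r
set.seed(4)
y <- c(rnorm(10, mean = 0, sd = 1), rnorm(40, mean = 0, sd = 3))
g <- factor(rep(c("small", "large"), times = c(10, 40)))

oneway.test(y ~ g, var.equal = FALSE)  # Welch-type ANOVA (the default)
oneway.test(y ~ g, var.equal = TRUE)   # classical equal-variance ANOVA
```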
You may not suspect that the variances are different, but there is no
a priori reason to think that they are equal. Why should you assume
something you have no reason to believe is true? In my experience, people
are not using the t-test to say that two populations are in some general
way different, but rather specifically that the means differ. This is an
important question regardless of whether the variances are equal.

In your medication example, the shape of the two distributions was
different, but when making the decision of whether to approve a
medication, the more important question is whether the central tendency
differs: does one medication on average improve the outcome more than the
other? A secondary, though important, question is how variable the
outcome is. The investigators made a correct inference (in stating no
significant mean difference between the groups), but they missed an
important question that they could have asked of their data. This
omission has nothing to do with the t-test.

Using heteroskedasticity-robust methods DOES protect against "Bad Things."
What it doesn't do is reveal the existence of important data trends
unrelated to the hypothesis of interest.


Ian