regression towards the mean, AS paper November 2007
On 18/12/2007, at 7:32 AM, Duncan Murdoch wrote:
On 12/17/2007 1:21 PM, Troels Ring wrote:
Dear friends, regression towards the mean is interesting in medical
circles, and a very recent paper (The American Statistician November
2007;61:302-307 by Krause and Pinheiro) treats it at length. An
initial
example specifies (p 303):
"Consider the following example: we draw 100 samples from a bivariate
Normal distribution with X0~N(0,1), X1~N(0,1) and cov(X0,X1)=0.7, We
then calculate the p value for the null hypothesis that the means
of X0
and X1 are equal, using a paired Student's t test. The procedure is
repeated 1000 times, producing 1000 simulated p values. Because X0
and
X1 have identical marginal distributions, the simulated p values
behave
like independent Uniform(0,1) random variables." This I did not
understand, and simulating like shown below produced far from uniform
(0,1) p values - but I fail to see how it is wrong. I contacted the
authors of the paper but they did not answer. So, please, doesn?t the
code below specify a bivariate N(0,1) with covariance 0.7? I get p
values = 1 all over - not interesting, but how wrong?
Best wishes
Troels
library(MASS)
Sigma <- matrix(c(1,0.7,0.7,1),2,2)
Sigma
res <- NULL
for (i in 1:1000){
ff <-(mvrnorm(n=100, rep(0, 2), Sigma, empirical = TRUE))
res[i] <- t.test(ff[,1],ff[,2],paired=TRUE)$p.value}
Specifying empirical=TRUE means that your sampled values are not independent, the means are guaranteed to match exactly, and the mean difference is exactly zero. Thus all of the t statistics are exactly zero, and the p-values are exactly 1. Set empirical=FALSE (the default), and you'll see more reasonable results.
This has nothing to do really with the question that Troels asked, but the exposition quoted from the AA paper is unnecessarily confusing. The phrase ``Because X0 and X1 have identical marginal distributions ...'' throws the reader off the track. The identical marginal distributions are irrelevant. All one needs is that the ***means*** of X0 and X1 be the same, and then the null hypothesis tested by a paired t-test is true and so the p-values are (asymptotically) Uniform[0,1]. With a sample size of 100, the ``asymptotically'' bit can be safely ignored for any ``decent'' joint distribution of X0 and X1. If one further assumes that X0 - X1 is Gaussian (which has nothing to do with X0 and X1 having identical marginal distributions) then ``asymptotically'' turns into ``exactly''. cheers, Rolf Turner ###################################################################### Attention: This e-mail message is privileged and confidential. If you are not the intended recipient please delete the message and notify the sender. Any views or opinions presented are solely those of the author. This e-mail has been scanned and cleared by MailMarshal www.marshalsoftware.com ######################################################################