Message-ID: <f8e6ff050712171310l381a275eq75da61305c640573@mail.gmail.com>
Date: 2007-12-17T21:10:32Z
From: Hadley Wickham
Subject: regression towards the mean, AS paper November 2007
In-Reply-To: <36F293BA-497C-43D7-8943-7AFD3C8648A4@auckland.ac.nz>

>         This has nothing to do really with the question that Troels asked,
>         but the exposition quoted from the AA paper is unnecessarily confusing.
>         The phrase ``Because X0 and X1 have identical marginal
>         distributions ...''
>         throws the reader off the track.  The identical marginal distributions
>         are irrelevant.  All one needs is that the ***means*** of X0 and X1
>         be the same, and then the null hypothesis tested by a paired t-test
>         is true and so the p-values are (asymptotically) Uniform[0,1].  With
>         a sample size of 100, the ``asymptotically'' bit can be safely ignored
>         for any ``decent'' joint distribution of X0 and X1.  If one further
>         assumes that X0 - X1 is Gaussian (which has nothing to do with X0 and
>         X1 having identical marginal distributions) then ``asymptotically''
>         turns into ``exactly''.

Another related issue is that uniform distributions don't look very uniform:

# Draw from Uniform[0,1] at three sample sizes and compare the histograms
hist(runif(100))
hist(runif(1000))
hist(runif(10000))

Be sure to calibrate your eyes (and your bin width) before rejecting
the hypothesis that the distribution is uniform.
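
One rough way to do that calibration is to work out how much the bar heights should wobble by chance. This is only a sketch: the choice of 10 equal-width bins, the seed, and the variable names (n_bins, p_bin, sd_count) are my own assumptions, not anything from the paper. Under uniformity each bin count is Binomial(n, 1/n_bins), so its standard deviation is sqrt(n * p * (1 - p)).

```r
set.seed(1)  # arbitrary seed, for reproducibility only

n_bins <- 10            # assumed number of equal-width bins on [0, 1]
p_bin  <- 1 / n_bins    # probability of landing in any one bin
breaks <- seq(0, 1, length.out = n_bins + 1)

for (n in c(100, 1000, 10000)) {
  x <- runif(n)

  # Count how many draws fall into each bin
  counts <- table(cut(x, breaks = breaks, include.lowest = TRUE))

  # Expected count per bin and its binomial standard deviation
  expected <- n * p_bin
  sd_count <- sqrt(n * p_bin * (1 - p_bin))

  cat(sprintf("n = %5.0f  expected = %6.0f  sd = %5.1f  observed range = [%.0f, %.0f]\n",
              n, expected, sd_count, min(counts), max(counts)))
}
```

With 10 bins and n = 100, each bin expects 10 observations with a standard deviation of 3, so bars anywhere from roughly 4 to 16 (two standard deviations) are unremarkable. At n = 10000 the expected count is 1000 with a standard deviation of 30, which is why that histogram looks so much flatter.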

Hadley

-- 
http://had.co.nz/