Lack of independence in anova()

27 messages · Douglas Bates, Thomas Lumley, Spencer Graves +6 more

#
My first reaction to Duncan's example was "Touché -- with apologies
to Göran for suspecting an over-trivial example"! I had not thought
long enough about possible cases. Duncan is right; and maybe it is
the same example as Göran was thinking of.

Regarding Spencer's argument below, in Duncan's statement he says
"Z is supported on +/- A" (i.e. Z = A or Z = -A),
so P(|Z| < 1) = 0 and so Spencer's 1-2z = 0 and z=1/2 (but Spencer
stipulates that Z is symmetric).

In general, suppose P(Z = A) = p > 0 and P(Z = -A) = q = 1-p.

Since X and Y are symmetric, X/A has the same distribution
as X/(-A) and similarly for Y; hence for any v and w,
P(X/Z <= v | Z = z) is independent of z = +/- A, therefore
= P(X/Z <= v); and similarly for Y.

Also X/A, Y/A are independent, and so are X/(-A) and Y/(-A).

Hence P(X/Z <= v and Y/Z <= w)

    = p*P(X/Z <= v | Z = A)*P(Y/Z <= w | Z = A)

      + q*P(X/Z <= v | Z = -A)*P(Y/Z <= w | Z = -A)

    = (p + q)*P(X/Z <= v)*P(Y/Z <= w)

    = P(X/Z <= v)*P(Y/Z <= w)

so X/Z and Y/Z are independent.
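
The argument above can be checked numerically. The following is only a minimal sketch: it assumes X and Y are independent standard normals, and the values A = 2, p = 0.3, the sample size, the test point (v, w) and the names U and W are arbitrary illustrative choices, not taken from the thread.

```r
# Sketch: X, Y symmetric about 0 and independent; Z = +A or -A.
# A, p, n, v and w are arbitrary illustrative values (assumptions).
set.seed(1)
n <- 100000
A <- 2
p <- 0.3

X <- rnorm(n)
Y <- rnorm(n)
Z <- ifelse(runif(n) < p, A, -A)   # P(Z = A) = p, P(Z = -A) = 1 - p

U <- X / Z
W <- Y / Z

# Sample correlation of the two ratios; should be close to 0.
r <- cor(U, W)

# Joint probability versus product of marginals at one point (v, w).
v <- 0.25
w <- -0.1
joint     <- mean(U <= v & W <= w)
prod_marg <- mean(U <= v) * mean(W <= w)

print(c(cor = r, joint = joint, product = prod_marg))
```

Up to sampling error, the joint probability and the product of the marginals should agree, as the factorisation above requires. A simulation at one point (v, w) is of course only consistent with independence; it does not prove it.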

However, interesting though it may be, this is a side-issue
to the original question concerning independence of the F-ratios
in an ANOVA. Here, numerators and denominator are all positive,
so examples like the above are not relevant.

The original argument (that increasing Z diminishes both X/Z
and Y/Z simultaneously) applies; but it is also possible to
demonstrate analytically that P(X/Z <= v and Y/Z <= w) is
greater than P(X/Z <= v)*P(Y/Z <= w).

The original issue also was that, in R, there might be a bug
in anova(). However, one can, in R and independently of the
behaviour of anova(), demonstrate this positive correlation:

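  # Correlation of two ratios sharing a denominator, repeated 10000 times
  C <- numeric(10000)
  for (i in 1:10000) {
    X <- rchisq(1000, 5) / 5     # first numerator, 5 df
    Y <- rchisq(1000, 5) / 5     # second numerator, independent of X
    Z <- rchisq(1000, 20) / 20   # shared denominator, 20 df
    C[i] <- cor(X / Z, Y / Z)
  }
  hist(C)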

which shows that all 10000 correlations are positive.

Best wishes to all,
Ted.
On 07-Jul-05 Spencer Graves wrote:
--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 07-Jul-05                                       Time: 11:18:04
------------------------------ XFMail ------------------------------
#
On Thu, Jul 07, 2005 at 11:18:09AM +0100, Ted Harding wrote:
No need to apologize; that was of course my first reaction to Thomas'
statement.
On second thought it was not difficult to find: (X, Y) bivariate standard
normal, P(Z = 1) = P(Z = -1) = 1/2.

[...]
Maybe it is simplest to calculate Cov(X/Z, Y/Z), which turns out to be
equal to E(X)E(Y)V(1/Z) (given total independence). So, a necessary
condition for independence is that at least one of these three terms is
zero, which is impossible in the F-ratios case.
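
The covariance formula follows in one step: with X, Y and Z mutually independent, E(XY/Z^2) = E(X)E(Y)E(1/Z^2) and E(X/Z)E(Y/Z) = E(X)E(Y)E(1/Z)^2, so the difference is E(X)E(Y)V(1/Z). As a rough check, the sketch below compares a simulated covariance with that formula. It reuses the scaled chi-square setup from Ted's simulation (5, 5 and 20 df); the sample size and variable names are illustrative assumptions. It relies on the standard inverse chi-square moments E(1/chisq_k) = 1/(k-2) and E(1/chisq_k^2) = 1/((k-2)(k-4)), valid for k > 4.

```r
# Sketch: check Cov(X/Z, Y/Z) = E(X) E(Y) V(1/Z) by simulation.
# Degrees of freedom follow the earlier simulation; n is an arbitrary choice.
set.seed(1)
n <- 500000
k <- 20                        # denominator degrees of freedom

X <- rchisq(n, 5) / 5          # E(X) = 1
Y <- rchisq(n, 5) / 5          # E(Y) = 1
Z <- rchisq(n, k) / k          # shared denominator

empirical <- cov(X / Z, Y / Z)

# V(1/Z) = E(1/Z^2) - E(1/Z)^2 with 1/Z = k / chisq_k
var_inv_Z   <- k^2 / ((k - 2) * (k - 4)) - (k / (k - 2))^2
theoretical <- 1 * 1 * var_inv_Z   # E(X) * E(Y) * V(1/Z), about 0.154

print(c(empirical = empirical, theoretical = theoretical))
```

Since E(X), E(Y) and V(1/Z) are all strictly positive here, the covariance cannot vanish, which is the point being made about the F-ratios.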

Göran