A concrete type I/III Sum of square problem
Gregor Gorjanc <gregor.gorjanc at gmail.com> writes:
WPhantom <wp1 at tiscali.fr> writes:
Thanks Brian for the reference. I just discovered that it is available in our library, so I am going to borrow it and read it soon. Actually, I don't even know the difference between a multistratum and a single-stratum AOV. A quick search on Google returned only R materials, so I imagine that these concepts are quite specific to R.
You have to be careful not to confuse Google's view of the world with Reality... The concept of error strata is much older than R, and existed for instance in Genstat, anno 1977 or so. However, Genstat seems to have left little impression on the Internet.
I will read the book first before asking for more information.
The executive summary is that the concept of error strata relies substantially on having a balanced design (at least for the random effects), so that the analysis can be decomposed into analyses of means, contrasts, and contrasts of means. For unbalanced designs, you usually get meaningless analyses.
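As a rough illustration of what an error-strata analysis looks like in R, the sketch below fits a balanced design with aov() and an Error() term. The data are simulated, and the names d, subject, treat and y are made up for this example; none of it comes from the thread.

```r
# Simulated, balanced data: 6 subjects, each measured once under 3 treatments.
# All names and numbers here are hypothetical.
set.seed(1)
d <- expand.grid(subject = factor(1:6), treat = factor(c("a", "b", "c")))
subj_effect <- rnorm(6, sd = 2)                 # one random shift per subject
d$y <- subj_effect[as.integer(d$subject)] + rnorm(nrow(d))

# Error(subject) requests a multistratum analysis
fit <- aov(y ~ treat + Error(subject), data = d)
strata <- names(fit)
print(strata)      # one entry per error stratum
summary(fit)       # a separate ANOVA table for each stratum
```

Because every subject sees every treatment exactly once, the treatment effect is tested entirely within subjects. With unequal cell counts that clean split is no longer available, which is the situation the summary above warns about.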
Can you (Prof. Dalgaard) please point us to a relevant book on these topics? I am very interested, since my data are often unbalanced.
Hmm, the Danish tradition is highly based on lecture notes, so I don't have a specific book for you. One possible starting point is Tue Tjur (1984): Analysis of variance models in orthogonal designs. Int.Statist.Review 52, 33-81.
The thing to notice in relation to that paper is that the decomposition (p.55) of the covariance matrix as sum(lambda_B Q_B^0) is highly dependent on having an orthogonal design. Without the orthogonality, it still defines a model, but typically one without a sensible interpretation.
Look at a simple 1-way anova with three groups of equal size. The Q matrices will be the projections P_X and I-P_X, where X is the design matrix for the grouping factor, e.g.
> X <- model.matrix(~factor(rep(1:3,each=2)))
> X
  (Intercept) factor(rep(1:3, each = 2))2 factor(rep(1:3, each = 2))3
1           1                           0                           0
2           1                           0                           0
3           1                           1                           0
4           1                           1                           0
5           1                           0                           1
6           1                           0                           1
...
P_X can be found in the following semi-secret way:
> P <- stats:::proj.matrix(X)
> P
    1   2   3   4   5   6
1 0.5 0.5 0.0 0.0 0.0 0.0
2 0.5 0.5 0.0 0.0 0.0 0.0
3 0.0 0.0 0.5 0.5 0.0 0.0
4 0.0 0.0 0.5 0.5 0.0 0.0
5 0.0 0.0 0.0 0.0 0.5 0.5
6 0.0 0.0 0.0 0.0 0.5 0.5
Suppose we put a random component of 10 on P_X and 1 on (I-P_X). We then get
> diag(6) - P + 10*P
    1   2   3   4   5   6
1 5.5 4.5 0.0 0.0 0.0 0.0
2 4.5 5.5 0.0 0.0 0.0 0.0
3 0.0 0.0 5.5 4.5 0.0 0.0
4 0.0 0.0 4.5 5.5 0.0 0.0
5 0.0 0.0 0.0 0.0 5.5 4.5
6 0.0 0.0 0.0 0.0 4.5 5.5
which is a perfectly sensible covariance for within-group correlated data. Now try the same stunt with unbalanced data:
> X <- model.matrix(~factor(rep(1:3,1:3))-1)
> P <- stats:::proj.matrix(X)
> diag(6) - P + 10*P
   1   2   3 4 5 6
1 10 0.0 0.0 0 0 0
2  0 5.5 4.5 0 0 0
3  0 4.5 5.5 0 0 0
4  0 0.0 0.0 4 3 3
5  0 0.0 0.0 3 4 3
6  0 0.0 0.0 3 3 4
I.e. we are de facto assuming that observations in the smaller group have a larger variance than observations in the larger groups.
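The dependence of the implied variance on group size can be checked without the unexported helper. A minimal sketch, assuming the projection is computed directly as X (X'X)^{-1} X'; the function names proj_onto and strata_cov are invented for this example. For comparison it also builds the covariance of a random-intercept model, sigma_b^2 ZZ' + sigma^2 I, with the values 4.5 and 1 chosen so that it matches the balanced case.

```r
# Orthogonal projection onto the column space of X (X must have full column rank)
proj_onto <- function(X) X %*% solve(crossprod(X), t(X))

# Covariance with weight lambda_b on P_X and lambda_w on I - P_X (hypothetical helper)
strata_cov <- function(g, lambda_b = 10, lambda_w = 1) {
  P <- proj_onto(model.matrix(~ g - 1))
  lambda_w * (diag(length(g)) - P) + lambda_b * P
}

g_bal   <- factor(rep(1:3, each = 2))      # group sizes 2, 2, 2
g_unbal <- factor(rep(1:3, times = 1:3))   # group sizes 1, 2, 3

var_bal   <- unname(diag(strata_cov(g_bal)))
var_unbal <- unname(diag(strata_cov(g_unbal)))
n_unbal   <- as.vector(table(g_unbal)[g_unbal])  # group size for each observation

print(var_bal)                                   # same variance for every observation
print(cbind(n = n_unbal, variance = var_unbal))  # variance is 1 + 9/n

# Random-intercept covariance: between-group variance 4.5, residual variance 1
Z <- model.matrix(~ g_unbal - 1)
var_ri <- unname(diag(4.5 * tcrossprod(Z) + diag(length(g_unbal))))
print(var_ri)                                    # constant despite unequal group sizes
```

In the balanced case the two constructions coincide, since P_X = ZZ'/2 there. With unequal group sizes only the random-intercept form keeps the variance the same for every observation.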
Thanks
Sylvain Clément
At 12:38 14/02/2006, you wrote:
More to the point, you are confusing multistratum AOV with single-stratum AOV. For a good tutorial, see MASS4 (bibliographic information in the R FAQ). For unbalanced data we suggest you use lme() instead.
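For readers who have not used lme() before, here is a minimal sketch of a random-intercept fit on an unbalanced one-way layout. It assumes the nlme package is installed (it is a recommended package in standard R distributions); the data are simulated and every name and value below is hypothetical.

```r
library(nlme)   # provides lme()

# Simulated unbalanced one-way layout; all names and values are hypothetical
set.seed(42)
g <- factor(rep(1:8, times = c(1, 2, 3, 2, 4, 3, 1, 5)))
group_effect <- rnorm(nlevels(g), sd = 2)
d <- data.frame(g = g,
                y = 5 + group_effect[as.integer(g)] + rnorm(length(g)))

# Random intercept per group; unequal group sizes need no special handling
fit <- lme(y ~ 1, random = ~ 1 | g, data = d)
print(VarCorr(fit))   # estimated between-group and residual variances
print(fixef(fit))     # estimated overall mean
```

The model is specified through variance components rather than through projections onto strata, so it does not rely on the design being balanced.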
I do not have the whole book in my head as Prof. Ripley probably does,
but I cannot recall reading about this in MASS4. I am sure I am wrong,
so would you (Prof. Ripley) please be so kind as to point us to the
relevant chapters/pages?
Many thanks.
--
Lep pozdrav / With regards,
Gregor Gorjanc
----------------------------------------------------------------------
University of Ljubljana PhD student
Biotechnical Faculty
Zootechnical Department URI: http://www.bfro.uni-lj.si/MR/ggorjan
Groblje 3 mail: gregor.gorjanc <at> bfro.uni-lj.si
SI-1230 Domzale tel: +386 (0)1 72 17 861
Slovenia, Europe fax: +386 (0)1 72 17 888
----------------------------------------------------------------------
"One must learn by doing the thing; for though you think you know it,
you have no certainty until you try." Sophocles ~ 450 B.C.
----------------------------------------------------------------------
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907