Trouble about the interpretation of intercept in lm models
Marc Schwartz wrote:
DF.fitted
          Y A B     F.lm
1  21.86773 0 a 23.52957
2  25.91822 0 a 23.52957
3  20.82186 0 a 23.52957
4  42.97640 1 a 36.18023
5  36.64754 1 a 36.18023
6  30.89766 1 a 36.18023
7  47.43715 0 b 46.50615
8  48.69162 0 b 46.50615
9  47.87891 0 b 46.50615
10 53.47306 1 b 59.15681
11 62.55891 1 b 59.15681
12 56.94922 1 b 59.15681
13 61.89380 0 c 62.98442
14 53.92650 0 c 62.98442
15 70.62465 0 c 62.98442
16 74.77533 1 c 75.63508
17 74.91905 1 c 75.63508
18 79.71918 1 c 75.63508

# Now get the means of the fitted values across
# the combinations of A and B
M <- with(DF.fitted, tapply(F.lm, list(A = A, B = B), mean))
M
   B
A          a        b        c
  0 23.52957 46.50615 62.98442
  1 36.18023 59.15681 75.63508

Thus:

# Intercept = *fitted* mean at A = 0; B = "a"
M["0", "a"]
[1] 23.52957
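
For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the data from the printout above and compares the intercept with M["0", "a"]. The model call lm(Y ~ A + B) is an assumption: the original fit is not shown in this message, but an additive model with default treatment contrasts is consistent with the fitted values listed. The names DF, fit and intercept are mine.

```r
# Data copied from the DF.fitted printout (Y, A, B only).
DF <- data.frame(
  Y = c(21.86773, 25.91822, 20.82186, 42.97640, 36.64754, 30.89766,
        47.43715, 48.69162, 47.87891, 53.47306, 62.55891, 56.94922,
        61.89380, 53.92650, 70.62465, 74.77533, 74.91905, 79.71918),
  A = rep(rep(c(0, 1), each = 3), times = 3),
  B = factor(rep(c("a", "b", "c"), each = 6))
)

# ASSUMPTION: additive model with default treatment contrasts.
fit <- lm(Y ~ A + B, data = DF)

# Same steps as in the message: attach fitted values, tabulate by A and B.
DF.fitted <- cbind(DF, F.lm = fitted(fit))
M <- with(DF.fitted, tapply(F.lm, list(A = A, B = B), mean))

# The intercept is the fitted value in the reference cell A = 0, B = "a".
intercept <- unname(coef(fit)["(Intercept)"])
print(intercept)
print(M["0", "a"])
print(isTRUE(all.equal(intercept, unname(M["0", "a"]))))
```

Under that assumed model, the coefficient for A is the common vertical shift between the two rows of M, and the coefficients for B are the shifts of columns b and c relative to column a.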
Actually, notice that you are averaging identical values, so the "mean" in the tapply is slightly misleading. Notice also that the intercept may be defined even when _no_ observations have all-zero entries in the non-intercept columns of the design matrix. This is the usual case in linear regression, for instance, but it can happen in factorial designs (unbalanced, or using other than treatment contrasts) as well.
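
A small sketch of both situations, using made-up toy data (x, y, g and yg are hypothetical and not from this thread). In the regression, no observation has x = 0, yet the intercept is still the predicted value at x = 0. In the one-way layout with sum-to-zero contrasts, no row of the design matrix is zero outside the intercept column, and the intercept is the unweighted mean of the group means.

```r
# Case 1: simple linear regression, x never equal to 0.
x <- 10:20
y <- 5 + 2 * x + rep(c(-1, 0, 1), length.out = length(x))
fit.reg <- lm(y ~ x)

# The intercept is the prediction at x = 0, outside the observed range.
b0.reg  <- unname(coef(fit.reg)["(Intercept)"])
pred.at.zero <- unname(predict(fit.reg, newdata = data.frame(x = 0)))
print(any(x == 0))
print(c(b0.reg, pred.at.zero))

# Case 2: unbalanced one-way layout with sum-to-zero contrasts.
g  <- factor(rep(c("a", "b", "c"), times = c(2, 3, 4)))
yg <- c(1, 3, 4, 5, 6, 10, 11, 12, 13)
fit.sum <- lm(yg ~ g, contrasts = list(g = "contr.sum"))

# Does any observation have zeros in all non-intercept columns?
X <- model.matrix(fit.sum)
has.zero.row <- any(rowSums(abs(X[, -1, drop = FALSE])) == 0)
print(has.zero.row)

# The intercept is the unweighted mean of the three group means.
b0.sum <- unname(coef(fit.sum)["(Intercept)"])
mean.of.means <- mean(tapply(yg, g, mean))
print(c(b0.sum, mean.of.means))
```

In both fits the intercept is a perfectly well-defined parameter, but it does not correspond to the fitted value of any observed row.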
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907