Interpreting summary.lm for a 2 factor anova
As Petr Pikal mentioned, the difficulty in interpretation is entirely due
to the set of contrasts you chose. The default treatment contrasts are
not orthogonal and are therefore the most difficult to interpret.
The note in ?aov warns of this difficulty.
Sum contrasts will give you the numbers that are easiest to interpret:
options(contrasts = c("contr.sum", "contr.poly"))
warpbreakssum.aov <- aov(breaks ~ wool * tension, data = warpbreaks)
coef(warpbreakssum.aov)
model.tables(warpbreakssum.aov, type="effects")
model.tables(warpbreakssum.aov, type="means")
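As a minimal sketch of why the sum-contrast numbers are easy to read, the check below compares each coefficient with a deviation of a marginal or cell mean from the grand mean. It assumes the balanced warpbreaks design (9 observations per cell). The object names (fit_sum, wool_m, tens_m, cell_m, checks) are illustrative only, and the contrasts are passed to aov() directly so that the global options() are not changed.

```r
# Sum-to-zero contrasts supplied locally; options("contrasts") is left alone.
fit_sum <- aov(breaks ~ wool * tension, data = warpbreaks,
               contrasts = list(wool = "contr.sum", tension = "contr.sum"))
b <- coef(fit_sum)

grand  <- mean(warpbreaks$breaks)
wool_m <- tapply(warpbreaks$breaks, warpbreaks$wool, mean)
tens_m <- tapply(warpbreaks$breaks, warpbreaks$tension, mean)
cell_m <- tapply(warpbreaks$breaks,
                 list(warpbreaks$wool, warpbreaks$tension), mean)

checks <- c(
  # Intercept is the grand mean.
  intercept = isTRUE(all.equal(b[["(Intercept)"]], grand)),
  # wool1 is the deviation of the wool A mean from the grand mean.
  wool1     = isTRUE(all.equal(b[["wool1"]], wool_m[["A"]] - grand)),
  # tension1 and tension2 are the deviations for levels L and M.
  tension1  = isTRUE(all.equal(b[["tension1"]], tens_m[["L"]] - grand)),
  tension2  = isTRUE(all.equal(b[["tension2"]], tens_m[["M"]] - grand)),
  # wool1:tension1 is what remains of cell A.L after the main effects.
  wool1_tension1 = isTRUE(all.equal(
    b[["wool1:tension1"]],
    cell_m["A", "L"] - wool_m[["A"]] - tens_m[["L"]] + grand))
)
print(checks)
```

The effects for the last level of each factor (wool B, tension H) are not printed as coefficients; under sum contrasts they are minus the sum of the others.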
John Fox showed the algebra using the default treatment contrasts.
For full understanding you will need to read in a text more about
sets of linear contrasts and their algebra.
I recommend Section 10.3 in mine, of course.
Statistical Analysis and Data Display:
An Intermediate Course with Examples in R
Heiberger, Richard M., Holland, Burt
http://www.springer.com/us/book/9781493921218
On Sat, Dec 3, 2016 at 11:46 PM, Ashim Kapoor <ashimkapoor at gmail.com> wrote:
On Sun, Dec 4, 2016 at 10:03 AM, Ashim Kapoor <ashimkapoor at gmail.com> wrote:
Dear Sir,

Many thanks for the explanation. Prior to your email (with some help from a friend of mine) I was able to figure this one out.

If we look at the model:

y = intercept + B1.woolB + B2.tensionM + B3.tensionH + B4.woolB.tensionM + B5.woolB.tensionH + error

Here woolB, tensionM and tensionH are the dummy indicator variables, similar to how you have defined them.

Now suppose we consider y1,...,yn, all in group A.L (say). Then:

y1 + ... + yn = n . intercept
=> average(y1,...,yn) = intercept + 0 + 0 + 0 + 0 + 0

What was confusing me was how to compute the cell mean in the woolB, tensionH cell. If we have y_1,...,y_n all in group B.H, then:

y_1 + ... + y_n = n (intercept + B1 + 0 + B3 + 0 + B5)

Therefore the average of group B.H = intercept + B1 + B3 + B5.

Many thanks and Best Regards,
Ashim

On Sat, Dec 3, 2016 at 7:15 PM, Fox, John <jfox at mcmaster.ca> wrote:
Dear Ashim,

Sorry to chime in late, and my apologies if someone has already pointed this out, but here's the relationship between the cell means and the model coefficients, using the row-basis of the model matrix:

-------------------------- snip ------------------------
means <- with(warpbreaks, tapply(breaks, interaction(wool, tension), mean))
x.A <- rep(c(0, 1), 3)
x.B1 <- rep(c(0, 1, 0), each=2)
x.B2 <- rep(c(0, 0, 1), each=2)
x.AB1 <- x.A*x.B1
x.AB2 <- x.A*x.B2
X.basis <- cbind(1, x.A, x.B1, x.B2, x.AB1, x.AB2)
X.basis
       x.A x.B1 x.B2 x.AB1 x.AB2
[1,] 1   0    0    0     0     0
[2,] 1   1    0    0     0     0
[3,] 1   0    1    0     0     0
[4,] 1   1    1    0     1     0
[5,] 1   0    0    1     0     0
[6,] 1   1    0    1     0     1
solve(X.basis, means)
                x.A      x.B1      x.B2     x.AB1     x.AB2
 44.55556 -16.33333 -20.55556 -20.00000  21.11111  10.55556
coef(aov(breaks ~ wool * tension, data = warpbreaks))
(Intercept) woolB tensionM tensionH woolB:tensionM
44.55556 -16.33333 -20.55556 -20.00000 21.11111
woolB:tensionH
10.55556
-------------------------- snip ------------------------
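Going in the opposite direction from the snippet above, the following sketch rebuilds the six cell means from the fitted coefficients. It multiplies one model-matrix row per cell by the coefficient vector, which is the same arithmetic as "intercept + B1 + B3 + B5" for the B.H cell. The names cells, X and from_coef are illustrative, and treatment contrasts are stated explicitly in case the global default has been changed.

```r
trt <- list(wool = "contr.treatment", tension = "contr.treatment")
fit <- aov(breaks ~ wool * tension, data = warpbreaks, contrasts = trt)

# One row per cell, wool varying fastest: A.L, B.L, A.M, B.M, A.H, B.H.
cells <- expand.grid(wool    = levels(warpbreaks$wool),
                     tension = levels(warpbreaks$tension))

# Row basis of the model matrix (same 0/1 pattern as X.basis).
X <- model.matrix(~ wool * tension, data = cells, contrasts.arg = trt)

# Cell mean = sum of the coefficients switched on in that row.
from_coef <- drop(X %*% coef(fit))
names(from_coef) <- paste(cells$wool, cells$tension, sep = ".")

observed <- with(warpbreaks,
                 tapply(breaks, interaction(wool, tension), mean))

print(round(from_coef, 5))
print(all.equal(unname(from_coef), as.vector(observed)))
```

The last row of X is 1, 1, 0, 1, 0, 1, so the B.H mean picks up the intercept, woolB, tensionH and woolB:tensionH and nothing else.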
I hope this helps,
John
-----------------------------
John Fox, Professor
McMaster University
Hamilton, Ontario
Canada L8S 4M4
Web: socserv.mcmaster.ca/jfox
-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Ashim Kapoor
Sent: December 3, 2016 12:19 AM
To: David Winsemius <dwinsemius at comcast.net>
Cc: r-help at r-project.org
Subject: Re: [R] Interpreting summary.lm for a 2 factor anova

Please allow me to rephrase my query.
model.tables(model,"m")
Tables of means
Grand mean
28.14815
wool
wool
A B
31.037 25.259
tension
tension
L M H
36.39 26.39 21.67
wool:tension
tension
wool L M H
A 44.56 24.00 24.56
B 28.22 28.78 18.78
The above is the same as :
with( warpbreaks, tapply( breaks, interaction(wool, tension), mean ) )
A.L B.L A.M B.M A.H B.H
44.55556 28.22222 24.00000 28.77778 24.55556 18.77778
For reference:
model <- aov(breaks ~ wool * tension, data = warpbreaks)
summary.lm(model)
Call:
aov(formula = breaks ~ wool * tension, data = warpbreaks)
Residuals:
Min 1Q Median 3Q Max
-19.5556 -6.8889 -0.6667 7.1944 25.4444
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 44.556 3.647 12.218 2.43e-16 ***
woolB -16.333 5.157 -3.167 0.002677 **
tensionM -20.556 5.157 -3.986 0.000228 ***
tensionH -20.000 5.157 -3.878 0.000320 ***
woolB:tensionM 21.111 7.294 2.895 0.005698 **
woolB:tensionH 10.556 7.294 1.447 0.154327
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 10.94 on 48 degrees of freedom
Multiple R-squared: 0.3778, Adjusted R-squared: 0.3129
F-statistic: 5.828 on 5 and 48 DF, p-value: 0.0002772
Now I'll explain what is confusing me in the output of summary.lm.

Coeff of Intercept = 44.556 = cell mean for A.L. This is the base.

Coeff of woolB:L = -16.333 = 28.22222 - 44.556. This is the difference of this cell mean (B:L) from the base.

Coeff of woolA:tensionM = -20.556 = 24.000 - 44.556. This is the difference of this cell mean (A:M) from the base.

Coeff of woolA:tensionH = -20.000 = 24.55556 - 44.556. This is the difference of this cell mean (A:H) from the base.

This is where it stops being the difference from the base.

Coeff of woolB:tensionM = 21.111 should turn out to be 28.77778 - 44.556, but this is -15.77822.

Coeff of woolB:tensionH = 10.556 should turn out to be 18.77778 - 44.556, but this is -25.77822.

In the above 2 cases, we can't say that the coefficient = cell mean - base case. Can you tell me what statement should be made?

Best Regards,
Ashim

PS: My apologies for emailing my query to this list. Can you tell me the names of a few (active) statistics help lists?

On Sat, Dec 3, 2016 at 1:33 AM, David Winsemius <dwinsemius at comcast.net> wrote:
On Dec 2, 2016, at 9:09 AM, David Winsemius <dwinsemius at comcast.net> wrote:
On Dec 2, 2016, at 6:16 AM, Ashim Kapoor <ashimkapoor at gmail.com> wrote:
Dear Pikal,

All levels except the interactions are compared to the Intercept. I'm a little confused as to what's going on in the interaction terms, e.g. the cell wool B : tension M. Its mean is 28.78, and 28.78 - 44.56 = -15.78 != 21.111.

It's something like 44.56 (intercept) - 16.333 (woolB) - 20.556 (tensionM) + 21.111 (woolB:tensionM) = 28.782.

I don't know how to sum up the above line in terms of differences succinctly.
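One succinct way to state it, offered here as a sketch and not as part of the original exchange: under treatment contrasts an interaction coefficient is a difference of differences. woolB:tensionM is the B-minus-A gap at tension M, minus the same gap at the baseline tension L. The names m, gap_L, dd_M and dd_H below are illustrative.

```r
# Cell means as a 2 x 3 table: rows are wool (A, B), columns tension (L, M, H).
m <- with(warpbreaks, tapply(breaks, list(wool, tension), mean))

fit <- aov(breaks ~ wool * tension, data = warpbreaks,
           contrasts = list(wool = "contr.treatment",
                            tension = "contr.treatment"))
b <- coef(fit)

# B-minus-A gap at the baseline tension L; this is the woolB coefficient.
gap_L <- m["B", "L"] - m["A", "L"]

# Interaction coefficients: how much that gap changes at M and at H.
dd_M <- (m["B", "M"] - m["A", "M"]) - gap_L
dd_H <- (m["B", "H"] - m["A", "H"]) - gap_L

print(c(gap_L = gap_L, dd_M = dd_M, dd_H = dd_H))
print(all.equal(c(gap_L, dd_M, dd_H),
                unname(b[c("woolB", "woolB:tensionM", "woolB:tensionH")])))
```

Read this way, 21.111 says that the wool effect at tension M is 21.111 breaks larger than the wool effect at tension L, not that cell B.M differs from the base cell by that amount.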
The aov estimate will not exactly equal the observed mean (this is _statistics_ after all). You should be comparing the mean of that cell to the estimate:

44.556 + (-16.33) + (-20.556) + (21.11)

A respected participant advised me to look at this more closely. In this case (and I think in most such cases) where there are the same number of parameters as there are means, the model is "saturated" and there is no difference:

with(warpbreaks, tapply(breaks, interaction(wool, tension), mean))
     A.L      B.L      A.M      B.M      A.H      B.H
44.55556 28.22222 24.00000 28.77778 24.55556 18.77778

So the B:M estimate is identical up to rounding with the observed mean:

44.556 + (-16.33) + (-20.556) + (21.11)
[1] 28.78
The difference between the observed mean and the estimated mean is known as a 'residual'

I've also been privately but gently chided for this misstatement. Residuals are the difference between data and estimates.

and the sum of all the squared residuals is what is being minimized ... over all the cells including the one implicitly associated with the Intercept.
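Both corrected statements can be checked directly. The sketch below assumes only base R and the same model as in the thread; the names cell_mean, fitted_ok, resid_ok, rss and rse are illustrative. It confirms that the fitted value for every observation is its own cell mean, and that the residuals are the individual data values minus those fitted values.

```r
fit <- aov(breaks ~ wool * tension, data = warpbreaks)

# Each observation's own cell mean (ave() uses FUN = mean by default).
cell_mean <- ave(warpbreaks$breaks, warpbreaks$wool, warpbreaks$tension)

# Saturated model: fitted values coincide with the observed cell means.
fitted_ok <- isTRUE(all.equal(unname(fitted(fit)), cell_mean))

# Residuals are data minus fitted values, one per observation (54 of them).
resid_ok <- isTRUE(all.equal(unname(residuals(fit)),
                             warpbreaks$breaks - cell_mean))

# The minimized quantity, and the residual standard error derived from it.
rss <- sum(residuals(fit)^2)
rse <- sqrt(rss / df.residual(fit))

print(c(fitted_ok = fitted_ok, resid_ok = resid_ok))
print(round(rse, 2))
```

The last line should agree with the "Residual standard error" reported by summary.lm on 48 degrees of freedom.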
This isn't really on-topic for Rhelp since you are not having difficulty in getting the R program to perform its duties, but are rather in need of statistical education. That is not what this mailing list is set up for.

-- David.
-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Ashim Kapoor
Sent: Thursday, December 1, 2016 2:48 PM
To: r-help at r-project.org
Subject: [R] Interpreting summary.lm for a 2 factor anova

Dear all,

Here is a small example:
model <- aov(breaks ~ wool * tension, data = warpbreaks)
summary.lm(model)
Call:
aov(formula = breaks ~ wool * tension, data = warpbreaks)
Residuals:
Min 1Q Median 3Q Max
-19.5556 -6.8889 -0.6667 7.1944 25.4444
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 44.556 3.647 12.218 2.43e-16 ***
woolB -16.333 5.157 -3.167 0.002677 **
tensionM -20.556 5.157 -3.986 0.000228 ***
tensionH -20.000 5.157 -3.878 0.000320 ***
woolB:tensionM 21.111 7.294 2.895 0.005698 **
woolB:tensionH 10.556 7.294 1.447 0.154327
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 10.94 on 48 degrees of freedom
Multiple R-squared: 0.3778, Adjusted R-squared: 0.3129
F-statistic: 5.828 on 5 and 48 DF, p-value: 0.0002772
model.tables(model,"e")
Tables of effects
wool
wool
A B
2.8889 -2.8889
tension
tension
L M H
8.241 -1.759 -6.481
wool:tension
tension
wool L M H
A 5.278 -5.278 0.000
B -5.278 5.278 0.000
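For readers who want to see where these effect tables come from, here is a sketch that derives them by hand from the cell means, without fitting a model. It relies on the design being balanced, so the mean of the cell means equals the grand mean. The names cell, wool_eff, tension_eff, inter_eff and rebuilt are illustrative.

```r
# 2 x 3 table of cell means: rows are wool, columns are tension.
cell <- with(warpbreaks,
             tapply(breaks, list(wool = wool, tension = tension), mean))

grand <- mean(cell)  # equals the grand mean because every cell has 9 obs

# Main effects: marginal mean minus grand mean.
wool_eff    <- rowMeans(cell) - grand
tension_eff <- colMeans(cell) - grand

# Interaction effects: what is left of each cell mean after the
# grand mean and both main effects have been removed.
inter_eff <- cell - grand - outer(wool_eff, tension_eff, "+")

print(round(wool_eff, 4))
print(round(tension_eff, 3))
print(round(inter_eff, 3))

# Adding the pieces back together recovers the cell means.
rebuilt <- grand + outer(wool_eff, tension_eff, "+") + inter_eff
print(max(abs(rebuilt - cell)))
```

Each table sums to zero across its levels, which is why the effects are deviations from the grand mean and not from a baseline cell.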
model.tables(model,"m")
Tables of means
Grand mean
28.14815
wool
wool
A B
31.037 25.259
tension
tension
L M H
36.39 26.39 21.67
wool:tension
tension
wool L M H
A 44.56 24.00 24.56
B 28.22 28.78 18.78
I don't follow the output of summary.lm. I understand the output of model.tables for effects and means. For instance, what does 44.556 represent? Is it the grand average? The grand mean is 28.14815. Can someone help me understand the output of summary.lm?

Best Regards,
Ashim
[[alternative HTML version deleted]]
______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
David Winsemius Alameda, CA, USA