That's a great point, Tyler. It raises the question of what IS a good reference for statistics that treats them the way R does. There has been some discussion of that already, but one book that hasn't been mentioned is by John Fox, the author of the car package.
Fox, John. 1997. Applied regression analysis, linear models, and related methods. Sage Publications.
http://books.google.com/books?id=pr2mKvAxXeYC&printsec=frontcover&lr=
Although it is aimed mainly at the social sciences, I found it pretty readable and much more detailed than Crawley's books (admittedly, Fox's book is pitched at a higher level). As for R code, Fox also wrote "An R and S-Plus Companion to Applied Regression":
http://books.google.com/books?id=xWS8kgRjGcAC&printsec=frontcover&lr=
If you want a detailed understanding of ANOVA and regression the way R sees them, I think this pair of books is nearly as good as it gets.
Matt
-----Original Message-----
From: r-sig-ecology-bounces at r-project.org [mailto:r-sig-ecology-bounces at r-project.org] On Behalf Of tyler
Sent: Thursday, November 13, 2008 8:52 AM
To: r-sig-ecology at r-project.org
Subject: Re: [R-sig-eco] ANOVA Output
Apologies if I'm beating a dead horse here, but this is exactly the
problem I raised in the thread on classical statistics in R. If Katrina
is using a textbook like Sokal and Rohlf, it is indeed completely
unexpected to find that changing the order of explanatory variables in
an anova will produce different results. Thierry points out that this is
because R produces Type I SS by default. Unfortunately, nowhere in S&R
is this distinction explained, so for this problem a book widely
regarded as a comprehensive reference for biologists provides absolutely
no help.
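The order dependence described above is easy to reproduce on a small unbalanced design. The sketch below uses simulated data (the factors A and B and the response y are made up for illustration; they are not Katrina's data) and base R only. anova() on an lm fit reports the same sequential (Type I) table that summary() gives for an aov fit.

```r
# Hypothetical, deliberately unbalanced two-factor data (not from the thread)
set.seed(42)
A <- factor(rep(c("a1", "a2"), times = c(12, 8)))
B <- factor(c(rep(c("b1", "b2"), times = c(9, 3)),   # cell counts within a1
              rep(c("b1", "b2"), times = c(2, 6))))  # cell counts within a2
y <- rnorm(20) + 2 * (A == "a2") + 1 * (B == "b2")
d <- data.frame(A, B, y)

# Same model terms, entered in two different orders
tab_AB <- anova(lm(y ~ A + B, data = d))
tab_BA <- anova(lm(y ~ B + A, data = d))

print(tab_AB)
print(tab_BA)

# The residual line is identical, but the SS credited to A changes
ss_A_first  <- tab_AB["A", "Sum Sq"]   # A ignoring B
ss_A_second <- tab_BA["A", "Sum Sq"]   # A after adjusting for B
```

With balanced data (equal cell counts) the two tables would agree, which may be why textbook examples rarely show the effect.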
These questions come up all the time on the r-help list, and I think
it's a sign of a real disconnect between the presentation of classical
statistics in many undergrad programs and the way the tests are actually
implemented in R.
Anyway, that's a bigger issue. It may be helpful to know that the 'car'
package includes a function Anova() (not to be confused with the base
anova() function) that lets you calculate Type II or Type III sums of
squares.
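A minimal sketch of such a call, assuming the car package is installed. The built-in warpbreaks data, with a few rows dropped to unbalance it, stands in for a real data set; that subsetting is an arbitrary choice made only for illustration.

```r
# warpbreaks ships with R; dropping rows makes the design unbalanced
wb <- warpbreaks[-c(1:5, 30:33), ]

fit <- lm(breaks ~ wool * tension, data = wb)

# Sequential (Type I) table from base R, for comparison
print(anova(fit))

if (requireNamespace("car", quietly = TRUE)) {
  # Type II: each term adjusted for all other terms that do not contain it
  print(car::Anova(fit, type = "II"))

  # Type III is usually only meaningful with sum-to-zero contrasts
  fit_sum <- lm(breaks ~ wool * tension, data = wb,
                contrasts = list(wool = contr.sum, tension = contr.sum))
  print(car::Anova(fit_sum, type = "III"))
} else {
  message("car is not installed; install.packages('car') to run this part")
}
```

Unlike the sequential table, the Type II and Type III tables do not change when the terms in the formula are reordered.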
Cheers,
Tyler
"ONKELINX, Thierry" <Thierry.ONKELINX at inbo.be> writes:
Dear Katrina,
The F-values are different because you are testing different hypotheses:
anova yields Type I (sequential) SS. It looks like you expected Type III SS.
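To make "different hypotheses" concrete: each line of a Type I table is a comparison between two nested models, and which two depends on where the term sits in the formula. A sketch with simulated data (A, B and y are hypothetical), using base R only:

```r
# Hypothetical unbalanced data, for illustration only
set.seed(7)
A <- factor(rep(c("a1", "a2"), times = c(12, 8)))
B <- factor(c(rep(c("b1", "b2"), times = c(9, 3)),
              rep(c("b1", "b2"), times = c(2, 6))))
d <- data.frame(A, B, y = rnorm(20) + 2 * (A == "a2") + (B == "b2"))

m0  <- lm(y ~ 1,     data = d)
mA  <- lm(y ~ A,     data = d)
mB  <- lm(y ~ B,     data = d)
mAB <- lm(y ~ A + B, data = d)

# A listed first: "does A help compared with the intercept alone?"
ss_A_first <- anova(m0, mA)[2, "Sum of Sq"]
# A listed last: "does A help once B is already in the model?"
ss_A_last  <- anova(mB, mAB)[2, "Sum of Sq"]

# These match the A rows of the two sequential tables
seq_AB <- anova(lm(y ~ A + B, data = d))["A", "Sum Sq"]
seq_BA <- anova(lm(y ~ B + A, data = d))["A", "Sum Sq"]
print(c(ss_A_first = ss_A_first, seq_AB = seq_AB))
print(c(ss_A_last = ss_A_last, seq_BA = seq_BA))
```

Type III asks yet another question: each term is tested given all the other terms in the model, including interactions that contain it.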
HTH,
Thierry
----------------------------------------------------------------------------
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be
www.inbo.be
To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data.
~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey
-----Original Message-----
From: r-sig-ecology-bounces at r-project.org [mailto:r-sig-ecology-bounces at r-project.org] On Behalf Of Katrina W. Chu
Sent: Wednesday, November 12, 2008 10:27 PM
To: r-sig-ecology at r-project.org
Subject: [R-sig-eco] ANOVA Output
I have a question about my R output when I run a three-way ANOVA. I
just plugged the interaction term into the formula and presto! ANOVA!
But I noticed that if I change the order of the formula (or interaction
term), I get slightly different ANOVA outputs. I've pasted the output
at the bottom of this message. I didn't think that this should happen,
so I would appreciate it if anyone had any feedback on this problem.
Thanks in advance, Kat.
ANOVA <- aov(Chlorophyll.a~Treatment*SamplingPeriod*Site)
summary(ANOVA)
                               Df  Sum Sq Mean Sq F value    Pr(>F)
Treatment                       3   356.5   118.8  4.2878  0.005276 **
SamplingPeriod                  3   374.7   124.9  4.5069  0.003911 **
Site                            1  1016.5  1016.5 36.6791 2.629e-09 ***
Treatment:SamplingPeriod        9   467.6    52.0  1.8747  0.053284 .
Treatment:Site                  3   167.8    55.9  2.0176  0.110424
SamplingPeriod:Site             3  1670.2   556.7 20.0884 2.383e-12 ***
Treatment:SamplingPeriod:Site   9   277.2    30.8  1.1115  0.352455
Residuals                     534 14799.5    27.7
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
ANOVA <- aov(Chlorophyll.a~SamplingPeriod*Treatment*Site)
summary(ANOVA)
                               Df  Sum Sq Mean Sq F value    Pr(>F)
SamplingPeriod                  3   369.5   123.2  4.4437  0.004264 **
Treatment                       3   361.8   120.6  4.3510  0.004840 **
Site                            1  1016.5  1016.5 36.6791 2.629e-09 ***
SamplingPeriod:Treatment        9   467.6    52.0  1.8747  0.053284 .
SamplingPeriod:Site             3  1662.0   554.0 19.9894 2.718e-12 ***
Treatment:Site                  3   176.0    58.7  2.1166  0.097111 .
SamplingPeriod:Treatment:Site   9   277.2    30.8  1.1115  0.352455
Residuals                     534 14799.5    27.7
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
ANOVA <- aov(Chlorophyll.a~Site*SamplingPeriod*Treatment)
summary(ANOVA)
                               Df  Sum Sq Mean Sq F value    Pr(>F)
Site                            1  1008.9  1008.9 36.4050 2.998e-09 ***
SamplingPeriod                  3   374.1   124.7  4.4990  0.003953 **
Treatment                       3   364.8   121.6  4.3871  0.004607 **
Site:SamplingPeriod             3  1654.8   551.6 19.9026 3.050e-12 ***
Site:Treatment                  3   172.6    57.5  2.0761  0.102364
SamplingPeriod:Treatment        9   478.2    53.1  1.9172  0.047282 *
Site:SamplingPeriod:Treatment   9   277.2    30.8  1.1115  0.352455
Residuals                     534 14799.5    27.7
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
--
What is wanted is not the will to believe, but the will to find out, which is
the exact opposite. --Bertrand Russell