
Contrast anova multi factor

5 messages · Thierry Onkelinx, Mario José Marques-Azevedo, Peter Dalgaard

#
Hi all,

I am running a multi-factor ANOVA and I get a different Intercept depending
on whether the model includes an interaction term.

I have the following data:

set.seed(42)
dt <- data.frame(f1=c(rep("a",5),rep("b",5)),
                 f2=rep(c("I","II"),5),
                 y=rnorm(10))

When I run

summary.lm(aov(y ~ f1 * f2, data = dt))

The Intercept term is the mean of the cell with the first level of both f1
and f2. I can confirm that with:

tapply(dt$y, list(dt$f1, dt$f2), mean)

I know that the other terms are differences relative to the Intercept.

But I do not know what the Intercept is when the model does not have an
interaction term:

summary.lm(aov(y ~ f1 + f2, data = dt))

I know that I can create a specific contrast table, but I would like to
understand the default R output.

I read the contrasts sub-chapter in Crawley 2012 (The R Book). In his example
the Intercept differs depending on whether the model has an interaction term,
yet he explains that the Intercept is the mean of the first level of the
factors.

Best regards,

Mario

.............................................................
Mario José Marques-Azevedo
Ph.D. Candidate in Ecology
Dept. Plant Biology, Institute of Biology
University of Campinas - UNICAMP
Campinas, São Paulo, Brazil
#
Dear Mario,

The interpretation is the same: the average at the reference situation,
which is the group that has f1 at its first level and f2 at its first level
(here f1 == "a" and f2 == "I").
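
A minimal sketch of how the reference situation comes about, using the data from the original post. It assumes R's default contrast options have not been changed; `ref_f1`, `ref_f2` and `X` are names introduced here for illustration only.

```r
# Data from the original post
set.seed(42)
dt <- data.frame(f1 = c(rep("a", 5), rep("b", 5)),
                 f2 = rep(c("I", "II"), 5),
                 y  = rnorm(10))

# The first level (in sorted order) of each factor is the reference level
ref_f1 <- levels(factor(dt$f1))[1]
ref_f2 <- levels(factor(dt$f2))[1]

# Unordered factors use treatment contrasts unless options were changed
getOption("contrasts")
contrasts(factor(dt$f1))

# Design matrix of the additive model: the f1b and f2II columns are 0/1
# indicators, so a row with both equal to 0 is the reference situation
X <- model.matrix(y ~ f1 + f2, data = dt)
head(X)
```

With treatment contrasts, every coefficient other than the Intercept is a shift away from this reference group.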

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

2015-04-26 17:12 GMT+02:00 Mario José Marques-Azevedo <mariojmaaz at gmail.com>:

  
  
#
Dear Thierry,

That is the problem. I read that the interpretation is the same, but the
Intercept value in the summary is different:

The mean of the cell with level "a" of f1 and level "I" of f2 (the first
level of each factor) is 0.7127851.

When I run the model with the interaction term:

summary.lm(aov(y~f1*f2,data=dt))

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   0.7128     0.2884   2.471   0.0484 *
f1b           1.0522     0.4560   2.307   0.0605 .
f2II         -0.6787     0.4560  -1.488   0.1872
f1b:f2II     -1.1741     0.6449  -1.821   0.1185

I checked that the Intercept is the mean of level "a" of f1 and level "I" of
f2.

But when I run the model without the interaction term, the Intercept value
is different:

summary.lm(aov(y~f1+f2,data=dt))

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   0.9476     0.2976   3.185   0.0154 *
f1b           0.4651     0.3720   1.251   0.2513
f2II         -1.2658     0.3720  -3.403   0.0114 *

I do not know what the Intercept value represents in this case. I expected
it to be the mean of level "a" of f1 and level "I" of f2, but it is not.

Best regards,

Mario


On 26 April 2015 at 12:30, Thierry Onkelinx <thierry.onkelinx at inbo.be>
wrote:

  
  
#
The parameter is different because the model without interaction assumes that
the effect of f1 is independent of the effect of f2. So you force f1b:f2II
to be 0.

The interpretation is the same. The fit is conditional on the model
(interaction or no interaction).
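
A small sketch of what "conditional on the model" means for the data in this thread. It uses `lm()` directly, which gives the same coefficients as `summary.lm(aov(...))`; the object names (`fit_int`, `fit_add`, `cell_means`, ...) are introduced here for illustration.

```r
# Data from the original post
set.seed(42)
dt <- data.frame(f1 = c(rep("a", 5), rep("b", 5)),
                 f2 = rep(c("I", "II"), 5),
                 y  = rnorm(10))

fit_int <- lm(y ~ f1 * f2, data = dt)  # with interaction
fit_add <- lm(y ~ f1 + f2, data = dt)  # interaction forced to 0

# Observed mean of each f1 x f2 cell
cell_means <- tapply(dt$y, list(dt$f1, dt$f2), mean)

# Fitted value of each cell under both models
fitted_int <- tapply(fitted(fit_int), list(dt$f1, dt$f2), mean)
fitted_add <- tapply(fitted(fit_add), list(dt$f1, dt$f2), mean)

# Four parameters for four cells: the interaction model reproduces
# the observed cell means
all.equal(cell_means, fitted_int)

# Three parameters for four cells: the additive model cannot, so its
# estimate for the reference cell differs from the observed cell mean
cell_means - fitted_add
```

In both models the Intercept is the fitted value of the ("a", "I") cell. Only in the interaction model does that fitted value coincide with the raw cell mean.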

ir. Thierry Onkelinx

2015-04-26 17:40 GMT+02:00 Mario José Marques-Azevedo <mariojmaaz at gmail.com>:

  
  
#
A little more precisely: It is the estimate of the expected value at the reference situation. 

In a balanced two-way design, this can be worked out explicitly: it is the average of the first row + the average of the first column - the total average. E.g., the output below shows, in order, the fitted coefficients, the subject means, the time means, the total average, and (subject 1 mean + time 0 mean - total average), which reproduces the Intercept:

Call:
lm(formula = hr ~ subj + time, data = heart.rate)

Coefficients:
(Intercept)        subj2        subj3        subj4        subj5        subj6  
     94.917       18.000       -5.750       -8.000       30.500        6.500  
      subj7        subj8        subj9       time30       time60      time120  
    -22.000      -16.000       11.500       -4.000       -5.444       -4.222

     1      2      3      4      5      6      7      8      9 
 91.50 109.50  85.75  83.50 122.00  98.00  69.50  75.50 103.00

       0       30       60      120 
96.55556 92.55556 91.11111 92.33333

[1] 93.13889
[1] 94.91667
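
The same identity can be checked on any balanced layout. The sketch below uses a small simulated data set, not the heart.rate data; the names `bal`, `A` and `B` are hypothetical and the values are random.

```r
# Hypothetical balanced two-way layout: 3 levels of A x 4 levels of B,
# one observation per cell (simulated values, for illustration only)
set.seed(1)
bal <- expand.grid(A = factor(1:3), B = factor(1:4))
bal$y <- rnorm(nrow(bal))

fit <- lm(y ~ A + B, data = bal)

row_mean   <- mean(bal$y[bal$A == "1"])  # average of the first row
col_mean   <- mean(bal$y[bal$B == "1"])  # average of the first column
grand_mean <- mean(bal$y)                # total average

# In a balanced additive model these two numbers agree
coef(fit)[["(Intercept)"]]
row_mean + col_mean - grand_mean
```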

In an unbalanced design, the calculation of the intercept gets a bit lost in matrix-calculus land; there is no simple formula, but it is still an estimate of the same thing. 
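
For the unbalanced data in this thread (cell counts 3, 2, 2, 3), a minimal sketch of both points: the Intercept is still the model's prediction at the reference situation, and it comes out of the least-squares normal equations rather than a simple formula. The names `ref`, `pred`, `X` and `beta` are introduced here for illustration.

```r
# Data from the original post
set.seed(42)
dt <- data.frame(f1 = c(rep("a", 5), rep("b", 5)),
                 f2 = rep(c("I", "II"), 5),
                 y  = rnorm(10))

table(dt$f1, dt$f2)  # unequal cell counts: the design is unbalanced

fit_add <- lm(y ~ f1 + f2, data = dt)

# Prediction at the reference situation equals the Intercept
ref  <- data.frame(f1 = "a", f2 = "I")
pred <- predict(fit_add, newdata = ref)
pred
coef(fit_add)[["(Intercept)"]]

# The same estimates from the normal equations (X'X) beta = X'y
X    <- model.matrix(fit_add)
beta <- solve(crossprod(X), crossprod(X, dt$y))
beta
```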

- Peter D