Skip to content

How to get Greenhouse-Geisser epsilons from anova?

15 messages · Nils Skotara, Skotara, Peter Dalgaard +1 more

#
Dear all,

I apologize for my basic question.
I try to calculate an anova for repeated measurements with 3 factors 
(A,B,C) having 2, 2, and 7 levels.
or with an additional fourth between subjects factor D.
Everything works fine using
aov(val ~ A*B*C  + Error(subject/ (A*B*C) ) )  or
aov(val ~ (D*A*B*C)  + Error(subject/(A*B*C)) + D )
val, A, B, C, D and subject are columns in a data.frame.

How can I get the estimated Greenhouse-Geisser and Huynh-Feldt epsilons?
I know Peter Dalgaard described it in R-News Vol. 7/2, October 2007. 
However, unfortunately I am not able to apply that using my data...

Furthermore, I am still confused of how SPSS calculates the epsilons 
since it is mentioned that perhaps there are any errors in SPSS??

I would be glad if anyone could help me!
I am looking forward to hearing from you!

Thank you!
Nils
#
Skotara wrote:
Why? It is supposed to work. You just need to work out the X and M 
specification for the relevant error strata and set test="Spherical" for 
anova.mlm, or work out the T contrast matrix explicitly if that suits 
your temper better.

  
    
#
Dear Mr. Daalgard.

thank you very much for your reply, it helped me to progress a bit.

The following works fine:
dd <- expand.grid(C = 1:7, B= c("r", "l"), A= c("c", "f"))
myma <- as.matrix(myma) #myma is a 12 by 28 list
mlmfit <- lm(myma~1)
mlmfit0 <- update(mlmfit, ~0)
anova(mlmfit, mlmfit0, X= ~C+B, M = ~A+C+B, idata = dd,  
test="Spherical"), which tests the main effect of A.
anova(mlmfit, mlmfit0, X= ~A+C,  M = ~A+C+B, idata = dd,  
test="Spherical"), which tests the main effect of B.


However, I can not figure out how this works for the other effects.
If I try:
anova(mlmfit, mlmfit0, X= ~A+B,  M = ~A+C+B, idata = dd,  test="Spherical")

I get:
Fehler in function (object, ..., test = c("Pillai", "Wilks", 
"Hotelling-Lawley",  :
        residuals have rank 1 < 4

I also don't know how I can calculate the various interactions..
My read is I should change the second argument mlmfit0, too, but I can't 
figure out how...

Do you know what to do?
Thank you very much!



Peter Dalgaard schrieb:
#
Dear Nils,

You might also take a look at the Anova() function in the car package, which
though less flexible than anova() should get you the tests that you want
more simply. ?Anova has an example of a repeated-measures ANOVA with two
within-subject and two between-subject factors.

I hope that this helps,
 John

------------------------------
John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox
On
test="Spherical")
epsilons?
http://www.R-project.org/posting-guide.html
#
Skotara wrote:
dd$C is not a factor with that construction. It works for me after

dd$C <- factor(dd$C)

(The other message is nasty, though. It's slightly different in R-patched:

 > anova(mlmfit, mlmfit0, X= ~A+B, M = ~A+C+B, idata = dd, 
test="Spherical")
Error in solve.default(Psi, B) :
   system is computationally singular: reciprocal condition number = 
2.17955e-34

but it shouldn't happen...
Looks like it is a failure of the internal Thin.row function. Ick!
)
The "within" interactions should be straightforward, e.g.

M=~A*B*C
X=~A*B*C-A:B:C

etc.

The within/between interactions are otained from the similar tests of 
the between factor(s)

e.g.

mlmfitD <- lm(myma~D)

and then

anova(mlmfitD, mlmfit,....)

  
    
#
Thank you, this helped me a lot!
All within effects and interactions work well!

Sorry, but I still could not get how to include the between factor..
If I include D with 2 levels, then myma is 24 by 28. (another 12 by 28 for the
second group of subjects.)
mlmfitD <- lm(myma~D) is no problem, 
but whatever I tried afterwards did not seem logical to me.
I am afraid I do not understand how to include the between factor. 
I cannot include ~D into M or X because it has length 24 whereas the other
factors have 28... 


Zitat von Peter Dalgaard <p.dalgaard at biostat.ku.dk>:
#
Nils Skotara wrote:
Just do the same as before, but comparing mlmfitD to mlmfit:

anova(mlmfitD, mlmfit, X=~A+B, M=~A+B+C)
# or anova(mlmfitD, mlmfit, X=~1, M=~C), as long as things are balanced


gives the D:C interaction test (by testing whether the C contrasts 
depend on D). The four-factor interaction is

anova(mlmfitD, mlmfit, X=~(A+B+C)^2, M=~A*B*C)

  
    
1 day later
#
Thank you for your help!
Sorry, for bothering you again..
I still have trouble combining within and between subject factors.
Interactions of within factors and D having only 2 levels work well.

How can I get the main effect of D? I have tried anova(mlmfitD, mlmfit). 
With D having 3 levels I would expect the dfs to be 2 and 33. However, 
the output states 84,24??

As long as the between factor has only 2 levels the between/within 
interactions fit well with SPSS, but if D has 3 levels, the mismatch is 
immense.
If I calculate the within effects with myma having  not 12 subjects from 
one group but for example 24 from 2 groups, the output treats it as if 
all subjects came from the same group, for example for main effect A the 
dfs are 1 and 35. SPSS puts out 1 and 33 which is what I would have 
expected.. ..



Peter Dalgaard schrieb:
#
Skotara wrote:
That's not a main effect, it's a simultaneous test of all 28 components. 
  What you need is an analysis of the average of all 
within-measurements. In principle that is obtained by X=~0, M=~1 or 
T=matrix(1,1,28) but there's a bug that prevents it from working (fixed 
in R-patched a couple of days ago).
Hmm, there's a generic problem in that you can't get some of the 
traditional ANOVA table F tests by comparing two models, and in your 
case, SPSS is de facto using the residuals from a model with A:D 
interaction when testing for A. It might help if you try

anova(mlmfitD, X=~..., M=~...)

Look at the (Intercept) line.

  
    
#
Dear Peter and Nils,

I hesitate to repeat this (though I'm going to do it anyway!), but it's
quite simple to get these tests from Anova() in the car package. Here's an
example from ?Anova of a repeated-measures ANOVA with two within and two
between-subject factors:

-------------- snip ---------------

Anova> ## a multivariate linear model for repeated-measures data
Anova> ## See ?OBrienKaiser for a description of the data set used in this
example.
Anova> 
Anova> phase <- factor(rep(c("pretest", "posttest", "followup"), c(5, 5,
5)),
Anova+     levels=c("pretest", "posttest", "followup"))

Anova> hour <- ordered(rep(1:5, 3))

Anova> idata <- data.frame(phase, hour)

Anova> idata
      phase hour
1   pretest    1
2   pretest    2
3   pretest    3
4   pretest    4
5   pretest    5
6  posttest    1
7  posttest    2
8  posttest    3
9  posttest    4
10 posttest    5
11 followup    1
12 followup    2
13 followup    3
14 followup    4
15 followup    5

Anova> mod.ok <- lm(cbind(pre.1, pre.2, pre.3, pre.4, pre.5, 
Anova+                      post.1, post.2, post.3, post.4, post.5, 
Anova+                      fup.1, fup.2, fup.3, fup.4, fup.5) ~
treatment*gender, 
Anova+                 data=OBrienKaiser)

Anova> (av.ok <- Anova(mod.ok, idata=idata, idesign=~phase*hour)) 

Type II Repeated Measures MANOVA Tests: Pillai test statistic
                            Df test stat approx F num Df den Df    Pr(>F)

treatment                    2    0.4809   4.6323      2     10 0.0376868 *

gender                       1    0.2036   2.5558      1     10 0.1409735

treatment:gender             2    0.3635   2.8555      2     10 0.1044692

phase                        1    0.8505  25.6053      2      9 0.0001930
***
treatment:phase              2    0.6852   2.6056      4     20 0.0667354 .

gender:phase                 1    0.0431   0.2029      2      9 0.8199968

treatment:gender:phase       2    0.3106   0.9193      4     20 0.4721498

hour                         1    0.9347  25.0401      4      7 0.0003043
***
treatment:hour               2    0.3014   0.3549      8     16 0.9295212

gender:hour                  1    0.2927   0.7243      4      7 0.6023742

treatment:gender:hour        2    0.5702   0.7976      8     16 0.6131884

phase:hour                   1    0.5496   0.4576      8      3 0.8324517

treatment:phase:hour         2    0.6637   0.2483     16      8 0.9914415

gender:phase:hour            1    0.6950   0.8547      8      3 0.6202076

treatment:gender:phase:hour  2    0.7928   0.3283     16      8 0.9723693

---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 

Anova> summary(av.ok, multivariate=FALSE)

Univariate Type II Repeated-Measures ANOVA Assuming Sphericity

                                 SS num Df Error SS den Df       F    Pr(>F)

treatment                   211.286      2  228.056     10  4.6323  0.037687
*  
gender                       58.286      1  228.056     10  2.5558  0.140974

treatment:gender            130.241      2  228.056     10  2.8555  0.104469

phase                       167.500      2   80.278     20 20.8651 1.274e-05
***
treatment:phase              78.668      4   80.278     20  4.8997  0.006426
** 
gender:phase                  1.668      2   80.278     20  0.2078  0.814130

treatment:gender:phase       10.221      4   80.278     20  0.6366  0.642369

hour                        106.292      4   62.500     40 17.0067 3.191e-08
***
treatment:hour                1.161      8   62.500     40  0.0929  0.999257

gender:hour                   2.559      4   62.500     40  0.4094  0.800772

treatment:gender:hour         7.755      8   62.500     40  0.6204  0.755484

phase:hour                   11.083      8   96.167     80  1.1525  0.338317

treatment:phase:hour          6.262     16   96.167     80  0.3256  0.992814

gender:phase:hour             6.636      8   96.167     80  0.6900  0.699124

treatment:gender:phase:hour  14.155     16   96.167     80  0.7359  0.749562

---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 


Mauchly Tests for Sphericity

                            Test statistic p-value
phase                              0.74927 0.27282
treatment:phase                    0.74927 0.27282
gender:phase                       0.74927 0.27282
treatment:gender:phase             0.74927 0.27282
hour                               0.06607 0.00760
treatment:hour                     0.06607 0.00760
gender:hour                        0.06607 0.00760
treatment:gender:hour              0.06607 0.00760
phase:hour                         0.00478 0.44939
treatment:phase:hour               0.00478 0.44939
gender:phase:hour                  0.00478 0.44939
treatment:gender:phase:hour        0.00478 0.44939


Greenhouse-Geisser and Huynh-Feldt Corrections
 for Departure from Sphericity

                             GG eps Pr(>F[GG])    
phase                       0.79953  7.323e-05 ***
treatment:phase             0.79953    0.01223 *  
gender:phase                0.79953    0.76616    
treatment:gender:phase      0.79953    0.61162    
hour                        0.46028  8.741e-05 ***
treatment:hour              0.46028    0.97879    
gender:hour                 0.46028    0.65346    
treatment:gender:hour       0.46028    0.64136    
phase:hour                  0.44950    0.34573    
treatment:phase:hour        0.44950    0.94019    
gender:phase:hour           0.44950    0.58903    
treatment:gender:phase:hour 0.44950    0.64634    
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 

                             HF eps Pr(>F[HF])    
phase                       0.92786  2.388e-05 ***
treatment:phase             0.92786    0.00809 ** 
gender:phase                0.92786    0.79845    
treatment:gender:phase      0.92786    0.63200    
hour                        0.55928  2.014e-05 ***
treatment:hour              0.55928    0.98877    
gender:hour                 0.55928    0.69115    
treatment:gender:hour       0.55928    0.66930    
phase:hour                  0.73306    0.34405    
treatment:phase:hour        0.73306    0.98047    
gender:phase:hour           0.73306    0.65524    
treatment:gender:phase:hour 0.73306    0.70801    
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

-------------- snip ---------------

The default is so-called "Type-II" tests, but with proper contrast coding
one can get so-called "Type-III" tests as well.

Regards,
 John

------------------------------
John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox
On
other
http://www.R-project.org/posting-guide.html
#
Dear John and Peter,

thank you both very much for your help!
Everything works fine now!

John, Anova also works very fine. Thank you very much!
However, if I had more than 2 levels for the between factor the same 
thing as mentioned occured.
The degrees of freedom showed that Anova calculated it as if all 
subjects came from the same group, for example for main effect A the dfs 
are 1 and 35.
Since I can get those values using anova that causes no problem.

I saw that the x$G to get the greenhouse-geisser epsilon do work for:
x<- anova(mlmfitD, X=~C+B, M=~A+C+B, test = "Spherical")
but does not work for y$G:
y <- anova(mlmfit, mlmfit0, X= ~C+B, M = ~A+C+B, idata = 
dd,test="Spherical")

Finally, the Greenhouse-Geisser epsilons are identical using both 
methods and to the SPSS output.
The Huynh-Feldt are not the same as them of SPSS. I will use GG instead.
#
Dear Nils,
That is odd -- the example that I sent had a factor with 3 levels, producing
2 df; here's a simplified version with s single between-subject factor
(treatment):
[1] control control control control control A       A       A       A      
[10] B       B       B       B       B       B       B      
attr(,"contrasts")
        [,1] [,2]
control   -2    0
A          1   -1
B          1    1
Levels: control A B
+                       post.1, post.2, post.3, post.4, post.5, 
+                       fup.1, fup.2, fup.3, fup.4, fup.5) ~  treatment, 
+                data=OBrienKaiser)
Univariate Type II Repeated-Measures ANOVA Assuming Sphericity

                         SS num Df Error SS den Df       F    Pr(>F)    
treatment            186.75      2   416.58     13  2.9139  0.090041 .  
phase                167.50      2    92.17     26 23.6257 1.419e-06 ***
treatment:phase       77.00      4    92.17     26  5.4304  0.002578 ** 
hour                 106.29      4    72.81     52 18.9769 1.128e-09 ***
treatment:hour         0.89      8    72.81     52  0.0798  0.999597    
phase:hour            11.08      8   116.96    104  1.2319  0.287947    
treatment:phase:hour   5.96     16   116.96    104  0.3312  0.992574    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 


Mauchly Tests for Sphericity

                     Test statistic p-value
phase                       0.85151 0.38119
treatment:phase             0.85151 0.38119
hour                        0.09859 0.00194
treatment:hour              0.09859 0.00194
phase:hour                  0.00837 0.09038
treatment:phase:hour        0.00837 0.09038


Greenhouse-Geisser and Huynh-Feldt Corrections
 for Departure from Sphericity

                      GG eps Pr(>F[GG])    
phase                0.87071  5.665e-06 ***
treatment:phase      0.87071   0.004301 ** 
hour                 0.48867  1.016e-05 ***
treatment:hour       0.48867   0.986835    
phase:hour           0.50283   0.308683    
treatment:phase:hour 0.50283   0.950677    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

                      HF eps Pr(>F[HF])    
phase                0.99390  1.515e-06 ***
treatment:phase      0.99390   0.002641 ** 
hour                 0.57413  2.190e-06 ***
treatment:hour       0.57413   0.992771    
phase:hour           0.75630   0.298926    
treatment:phase:hour 0.75630   0.981607    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

I didn't save your earlier postings, so can't check your model and data. If
you like, please feel free to send them to me privately.

Regards,
 John
#
Skotara wrote:
Do yourself a favour and check what the actual names are:

foo$"G-G Pr"

works in both forms. Notice that these are the epsilon-corrected 
p-values, and not the epsilons themselves.
If they differ by a factor of N/(f+1) where f is the number of df for 
the SSD matrix (alias N-k if all you have is a grouping into k groups), 
then SPSS has the same bug as SAS. (This is also mentioned in the R news 
paper).
#
Dear John,
thank you for the kind offer!
Sorry, I just made a mistake anywhere I can not trace back, now it works 
as you described it. Thank you again!

Dear Peter,
thank you for the information, I did not know about the quotation marks. 
It indeed works using "G-G Pr"!
The SPSS and R output for the epsilons differ exactly by N/(N-(k-1)). So 
I think it must be the mentioned bug.

I wish you merry christmas!
#
Dear Nils,
On
There are often many ways of doing something right, and infinitely many of
getting it wrong, so without specific information it's hard to know what
happened. Anyway, I'm glad that things are now working normally.

Regards,
 John
http://www.R-project.org/posting-guide.html