Likelihood Ratio tests and fixed effects with LMER

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20081208/c58fbcc8/attachment.pl>
Sorry for sending twice, but I didn't receive any answer. If someone 
could help me please,

David
---------------

Hello,

I previously read this : 
https://stat.ethz.ch/pipermail/r-sig-mixed-models/2008q4/001458.html
but I still have some questions about my statistical analysis.

Here is my problem :

We study the effect of habitat structure on the spawning behaviour in 
salmon. So we experimentally observed the behaviour of fish (we 
considered 6 types of behaviours) during the spawning season.

Then I want to analyse if the number of behaviour (for each type) we 
observed 1 hour before the spawn is influenced by 2 factors : the 
structure of habitat (2 experimental levels : homogeneous and 
heterogeneous habitat) and the presence of parr (2 experimental levels: 
0 (absence) or 1 (presence)).

As we have got several spawning events by female (because we can not use 
a lot of females), I have to use mixed models, with female as random 
effect (a female is used only in one type of habitat and in presence OR 
absence of parr, ).

 So for each type of behaviour, I want to know if fixed effects 
(habitat, parr and interaction) are significant or not. As previously 
requested, and answered by Mister Bates 
(https://stat.ethz.ch/pipermail/r-sig-mixed-models/2008q4/001458.html), 
I tried to use Likelihood ratio tests to determine if these two factors 
are significant or not. Here are an example about one type of behaviour 
(Fprob = number of probings by female) :

/> Fprob.lmer.full<-lmer(Fprob ~habitat+parr+habitat: parr 
+(1|female),family=poisson)
 > Fprob.lmer.add<-lmer(Fprob ~habitat+ parr +(1|female),family=poisson)
 > Fprob.lmer.hab<-lmer(Fprob ~habitat+(1|female),family=poisson)
 > Fprob.lmer.parr<-lmer(Fprob ~ parr +(1|female),family=poisson)
 > Fprob.lmer.null<-lmer(Fprob ~1+(1|female),family=poisson)

 > anova(Fprob.lmer.full,Fprob.lmer.add)
Data:
Models:
Fprob.lmer.add: Fprob ~ habitat + parr + (1 | female)
Fprob.lmer.full: Fprob ~ habitat + parr + habitat:parr + (1 | female)
                Df     AIC     BIC  logLik Chisq Chi Df Pr(>Chisq)
Fprob.lmer.add   4  319.71  326.57 -155.86                       
Fprob.lmer.full  5  321.17  329.74 -155.58 0.545      1     0.4604

 > anova(Fprob.lmer.hab,Fprob.lmer.null)
Data:
Models:
Fprob.lmer.null: Fprob ~ 1 + (1 | female)
Fprob.lmer.hab: Fprob ~ habitat + (1 | female)
                Df     AIC     BIC  logLik  Chisq Chi Df Pr(>Chisq)
Fprob.lmer.null  2  319.11  322.54 -157.55                        
Fprob.lmer.hab   3  320.12  325.26 -157.06 0.9866      1     0.3206

 > anova(Fprob.lmer.parr,Fprob.lmer.null)
Data:
Models:
Fprob.lmer.null: Fprob ~ 1 + (1 | female)
Fprob.lmer.parr: Fprob ~ parr + (1 | female)
                Df     AIC     BIC  logLik  Chisq Chi Df Pr(>Chisq)
Fprob.lmer.null  2  319.11  322.54 -157.55                        
Fprob.lmer.parr  3  319.07  324.21 -156.54 2.0372      1     0.1535/

So, I considered as LR test for each effect :

Habitat : Chi2 = 0.9866; df = 1; p = 0.3206
Parr : Chi2 = 2.0372; df = 1; p = 0.1535
Interaction Habitat :parr : Chi2 = 0.545; df = 1; p = 0.4604

I first would like to know if I am wrong, or if I correctly analysed my 
data ?

Second, I guess that LMER function is optimizing REML by default (if I 
correctly read the help file), but I had understood that we need to 
optimize ML to compare fixed effects (from  Pinheiro J C & Bates D M, 
"Mixed-effects models in S and S-PLUS"). If right, what do I need to 
change to correctly analyzed my data with Likelihood ratio tests ?

I am finishing my PhD, and I have to finish to correctly analyze 
behavioural data (with Poisson distribution) and genetic data as well 
(binomial data) before publishing.
Thank you for all the help you can bring to me here.

Best regards,

David Grimardias
-------------------------------------
David GRIMARDIAS
Doctorant
UMR INRA-UPPA ECOBIOP
Quartier Ibarron
64310 SAINT PEe SUR NIVELLE

T?l.: 0559515979
Fax: 0559545152
mail: David.Grimardias at st-pee.inra.fr
http://www.st-pee.inra.fr

So for each type of behaviour, I want to know if fixed effects (habitat, parr 
and interaction) are significant or not. As previously requested, and 
answered by Mister Bates 
(https://stat.ethz.ch/pipermail/r-sig-mixed-models/2008q4/001458.html), I 
tried to use Likelihood ratio tests to determine if these two factors are 
significant or not. Here are an example about one type of behaviour (Fprob = 
number of probings by female) :

Fprob.lmer.full<-lmer(Fprob ~habitat+parr+habitat:parr +(1|female),family=poisson)
Fprob.lmer.add<-lmer(Fprob ~habitat+ parr +(1|female),family=poisson)
Fprob.lmer.hab<-lmer(Fprob ~habitat+(1|female),family=poisson)
Fprob.lmer.parr<-lmer(Fprob ~ parr +(1|female),family=poisson)
Fprob.lmer.null<-lmer(Fprob ~1+(1|female),family=poisson)
So, I considered as LR test for each effect :

Habitat : Chi2 = 0.9866; df = 1; p = 0.3206
Parr : Chi2 = 2.0372; df = 1; p = 0.1535
Interaction Habitat :parr : Chi2 = 0.545; df = 1; p = 0.4604

I first would like to know if I am wrong, or if I correctly analysed my data?

Yes, though there has been some discussion on this list about the adequacy of
the chi-square approximation for the LRTS in these models.

You should also compare the results to those from simple models ignoring
the fact that females may appear more than once.  They should be broadly
similar.  How large is the variance due to the random effects anyway?
Second, I guess that LMER function is optimizing REML by default (if I 
correctly read the help file), but I had understood that we need to optimize 
ML to compare fixed effects (from  Pinheiro J C & Bates D M, "Mixed-effects 
models in S and S-PLUS"). If right, what do I need to change to correctly 
analyzed my data with Likelihood ratio tests ?
REML is not done for GLMMs IIRC.

David Duffy.
| David Duffy (MBBS PhD)                                         ,-_|\
| email: davidD at qimr.edu.au  ph: INT+61+7+3362-0217 fax: -0101  /     *
| Epidemiology Unit, Queensland Institute of Medical Research   \_,-._/
| 300 Herston Rd, Brisbane, Queensland 4029, Australia  GPG 4D0B994A v
Hello,

I am quite new to all this approach, but since it's somewhat different  
from what I've seen in classes and I have to rely on what I have been  
learning by myself and on lists such as this one, it's easy to get  
confused. Anyway, I have seen in several textbooks which take this  
anova table / LRT approach the opposite "direction" of effect testing:  
starting with the full model, removing terms (interactions first, then  
according to effect size, for example) and comparing to the previous  
one.

So, the example would go something like this:

Fprob.lmer.full x Fprob.lmer.add to test the interaction term
Fprob.lmer.add x Fprob.lmer.parr (just a guess, judging for the AIC  
value) to test for the habitat effect
Fprob.lmer.parr x Fprob.lmer.null to test for the parr effect

Only proceeding to the next step if the LRT is nonsignificant. If  
changes to the model are significant, individual terms which haven't  
been tested yet would be tested by removing them and comparing to that  
model.

In this case, only the habitat effect could have a different p value,  
but I can imagine when you are comparing models with more terms that  
results could differ. I've even seen cases in which terms are  
sequentially removed as long as they are nonsignificant, point in  
which single terms previously removed are "re-added" individually to  
see if there is improvement to the model.

This may even be kind of "off topic" for this list, but since  
considerable discussion has been going on here about hypothesis  
testing and LRT previously, and the question was originally asked  
here, I thought it wouldn't hurt to continue the discussion...
You should also compare the results to those from simple models  
ignoring
the fact that females may appear more than once.  They should be  
broadly
similar.  How large is the variance due to the random effects anyway?
There was some discussion here previously about how it wouldn't be  
good to compare likelihoods of GLMM and GLM, because they are  
generated differently or something like that... Could this be done by  
generating a dummy identity variable with no repetitions (simulating  
non-repeating individual identities), using it as a random variable,  
and comparing the models? I don't know what this could do to the  
estimation of parameters and so on...

Many thanks in advance, and sorry for anything!

Abra?os,
Rafael Maia
---
"A little learning is a dangerous thing; drink deep, or taste not the  
Pierian spring." (A. Pope)
Animal Behavior Lab
Dept. of Zoology, Universidade de Brasilia
http://www.unb.br/ib/zoo/rhmacedo/

On Tue, 16 Dec 2008, David Grimardias wrote:

So for each type of behaviour, I want to know if fixed effects  
(habitat, parr and interaction) are significant or not. As  
previously requested, and answered by Mister Bates (https://stat.ethz.ch/pipermail/r-sig-mixed-models/2008q4/001458.html 
), I tried to use Likelihood ratio tests to determine if these two  
factors are significant or not. Here are an example about one type  
of behaviour (Fprob = number of probings by female) :

Fprob.lmer.full<-lmer(Fprob ~habitat+parr+habitat:parr +(1| 
female),family=poisson)
Fprob.lmer.add<-lmer(Fprob ~habitat+ parr +(1| 
female),family=poisson)
Fprob.lmer.hab<-lmer(Fprob ~habitat+(1|female),family=poisson)
Fprob.lmer.parr<-lmer(Fprob ~ parr +(1|female),family=poisson)
Fprob.lmer.null<-lmer(Fprob ~1+(1|female),family=poisson)
So, I considered as LR test for each effect :

Habitat : Chi2 = 0.9866; df = 1; p = 0.3206
Parr : Chi2 = 2.0372; df = 1; p = 0.1535
Interaction Habitat :parr : Chi2 = 0.545; df = 1; p = 0.4604

I first would like to know if I am wrong, or if I correctly  
analysed my data?

Yes, though there has been some discussion on this list about the  
adequacy of
the chi-square approximation for the LRTS in these models.

You should also compare the results to those from simple models  
ignoring
the fact that females may appear more than once.  They should be  
broadly
similar.  How large is the variance due to the random effects anyway?

Second, I guess that LMER function is optimizing REML by default  
(if I correctly read the help file), but I had understood that we  
need to optimize ML to compare fixed effects (from  Pinheiro J C &  
Bates D M, "Mixed-effects models in S and S-PLUS"). If right, what  
do I need to change to correctly analyzed my data with Likelihood  
ratio tests ?
REML is not done for GLMMs IIRC.

David Duffy.

-- 
| David Duffy (MBBS PhD)                                         ,-_|\
| email: davidD at qimr.edu.au  ph: INT+61+7+3362-0217 fax:  
-0101  /     *
| Epidemiology Unit, Queensland Institute of Medical Research    
\_,-._/
| 300 Herston Rd, Brisbane, Queensland 4029, Australia  GPG 4D0B994A v

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Hello,

I am quite new to all this approach, but since it's somewhat different from 
what I've seen in classes and I have to rely on what I have been learning by 
myself and on lists such as this one, it's easy to get confused. Anyway, I 
have seen in several textbooks which take this anova table / LRT approach the 
opposite "direction" of effect testing: starting with the full model, 
removing terms (interactions first, then according to effect size, for 
example) and comparing to the previous one.

This may even be kind of "off topic" for this list, but since considerable 
discussion has been going on here about hypothesis testing and LRT 
previously, and the question was originally asked here, I thought it wouldn't 
hurt to continue the discussion...
It's a generic model building question.  If you have large numbers of 
variables, fitting the all K-way interactions model as a base model can be 
expensive, so people try to build upwards from simpler models.  I quite 
like the idea of Bayesian Model Averaging...
There was some discussion here previously about how it wouldn't be good to 
compare likelihoods of GLMM and GLM, because they are generated differently 
or something like that...
That was just whether particular constants are included or excluded in the 
likelihood expressions used in the different codes.  My comment was 
to the effect that the parameter estimates and LRTs from simple fixed 
effects models are a reality check for the results coming from the GLMMs, 
especially if you are worried that something isn't working properly.

David Duffy.
| David Duffy (MBBS PhD)                                         ,-_|\
| email: davidD at qimr.edu.au  ph: INT+61+7+3362-0217 fax: -0101  /     *
| Epidemiology Unit, Queensland Institute of Medical Research   \_,-._/
| 300 Herston Rd, Brisbane, Queensland 4029, Australia  GPG 4D0B994A v