An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20081208/c58fbcc8/attachment.pl>
Likelihood Ratio tests and fixed effects with LMER
5 messages · David Grimardias, Rafael Maia, David Duffy
8 days later
Sorry for sending twice, but I didn't receive any answer. If someone could help me please, David --------------- Hello, I previously read this : https://stat.ethz.ch/pipermail/r-sig-mixed-models/2008q4/001458.html but I still have some questions about my statistical analysis. Here is my problem : We study the effect of habitat structure on the spawning behaviour in salmon. So we experimentally observed the behaviour of fish (we considered 6 types of behaviours) during the spawning season. Then I want to analyse if the number of behaviour (for each type) we observed 1 hour before the spawn is influenced by 2 factors : the structure of habitat (2 experimental levels : homogeneous and heterogeneous habitat) and the presence of parr (2 experimental levels: 0 (absence) or 1 (presence)). As we have got several spawning events by female (because we can not use a lot of females), I have to use mixed models, with female as random effect (a female is used only in one type of habitat and in presence OR absence of parr, ). So for each type of behaviour, I want to know if fixed effects (habitat, parr and interaction) are significant or not. As previously requested, and answered by Mister Bates (https://stat.ethz.ch/pipermail/r-sig-mixed-models/2008q4/001458.html), I tried to use Likelihood ratio tests to determine if these two factors are significant or not. Here are an example about one type of behaviour (Fprob = number of probings by female) : /> Fprob.lmer.full<-lmer(Fprob ~habitat+parr+habitat: parr +(1|female),family=poisson) > Fprob.lmer.add<-lmer(Fprob ~habitat+ parr +(1|female),family=poisson) > Fprob.lmer.hab<-lmer(Fprob ~habitat+(1|female),family=poisson) > Fprob.lmer.parr<-lmer(Fprob ~ parr +(1|female),family=poisson) > Fprob.lmer.null<-lmer(Fprob ~1+(1|female),family=poisson) > anova(Fprob.lmer.full,Fprob.lmer.add) Data: Models: Fprob.lmer.add: Fprob ~ habitat + parr + (1 | female) Fprob.lmer.full: Fprob ~ habitat + parr + habitat:parr + (1 | female) Df AIC BIC logLik Chisq Chi Df Pr(>Chisq) Fprob.lmer.add 4 319.71 326.57 -155.86 Fprob.lmer.full 5 321.17 329.74 -155.58 0.545 1 0.4604 > anova(Fprob.lmer.hab,Fprob.lmer.null) Data: Models: Fprob.lmer.null: Fprob ~ 1 + (1 | female) Fprob.lmer.hab: Fprob ~ habitat + (1 | female) Df AIC BIC logLik Chisq Chi Df Pr(>Chisq) Fprob.lmer.null 2 319.11 322.54 -157.55 Fprob.lmer.hab 3 320.12 325.26 -157.06 0.9866 1 0.3206 > anova(Fprob.lmer.parr,Fprob.lmer.null) Data: Models: Fprob.lmer.null: Fprob ~ 1 + (1 | female) Fprob.lmer.parr: Fprob ~ parr + (1 | female) Df AIC BIC logLik Chisq Chi Df Pr(>Chisq) Fprob.lmer.null 2 319.11 322.54 -157.55 Fprob.lmer.parr 3 319.07 324.21 -156.54 2.0372 1 0.1535/ So, I considered as LR test for each effect : Habitat : Chi2 = 0.9866; df = 1; p = 0.3206 Parr : Chi2 = 2.0372; df = 1; p = 0.1535 Interaction Habitat :parr : Chi2 = 0.545; df = 1; p = 0.4604 I first would like to know if I am wrong, or if I correctly analysed my data ? Second, I guess that LMER function is optimizing REML by default (if I correctly read the help file), but I had understood that we need to optimize ML to compare fixed effects (from Pinheiro J C & Bates D M, "Mixed-effects models in S and S-PLUS"). If right, what do I need to change to correctly analyzed my data with Likelihood ratio tests ? I am finishing my PhD, and I have to finish to correctly analyze behavioural data (with Poisson distribution) and genetic data as well (binomial data) before publishing. Thank you for all the help you can bring to me here. Best regards, David Grimardias
------------------------------------- David GRIMARDIAS Doctorant UMR INRA-UPPA ECOBIOP Quartier Ibarron 64310 SAINT PEe SUR NIVELLE T?l.: 0559515979 Fax: 0559545152 mail: David.Grimardias at st-pee.inra.fr http://www.st-pee.inra.fr
On Tue, 16 Dec 2008, David Grimardias wrote:
So for each type of behaviour, I want to know if fixed effects (habitat, parr and interaction) are significant or not. As previously requested, and answered by Mister Bates (https://stat.ethz.ch/pipermail/r-sig-mixed-models/2008q4/001458.html), I tried to use Likelihood ratio tests to determine if these two factors are significant or not. Here are an example about one type of behaviour (Fprob = number of probings by female) :
Fprob.lmer.full<-lmer(Fprob ~habitat+parr+habitat:parr +(1|female),family=poisson) Fprob.lmer.add<-lmer(Fprob ~habitat+ parr +(1|female),family=poisson) Fprob.lmer.hab<-lmer(Fprob ~habitat+(1|female),family=poisson) Fprob.lmer.parr<-lmer(Fprob ~ parr +(1|female),family=poisson) Fprob.lmer.null<-lmer(Fprob ~1+(1|female),family=poisson)
So, I considered as LR test for each effect : Habitat : Chi2 = 0.9866; df = 1; p = 0.3206 Parr : Chi2 = 2.0372; df = 1; p = 0.1535 Interaction Habitat :parr : Chi2 = 0.545; df = 1; p = 0.4604 I first would like to know if I am wrong, or if I correctly analysed my data?
Yes, though there has been some discussion on this list about the adequacy of the chi-square approximation for the LRTS in these models. You should also compare the results to those from simple models ignoring the fact that females may appear more than once. They should be broadly similar. How large is the variance due to the random effects anyway?
Second, I guess that LMER function is optimizing REML by default (if I correctly read the help file), but I had understood that we need to optimize ML to compare fixed effects (from Pinheiro J C & Bates D M, "Mixed-effects models in S and S-PLUS"). If right, what do I need to change to correctly analyzed my data with Likelihood ratio tests ?
REML is not done for GLMMs IIRC. David Duffy.
| David Duffy (MBBS PhD) ,-_|\ | email: davidD at qimr.edu.au ph: INT+61+7+3362-0217 fax: -0101 / * | Epidemiology Unit, Queensland Institute of Medical Research \_,-._/ | 300 Herston Rd, Brisbane, Queensland 4029, Australia GPG 4D0B994A v
Hello, I am quite new to all this approach, but since it's somewhat different from what I've seen in classes and I have to rely on what I have been learning by myself and on lists such as this one, it's easy to get confused. Anyway, I have seen in several textbooks which take this anova table / LRT approach the opposite "direction" of effect testing: starting with the full model, removing terms (interactions first, then according to effect size, for example) and comparing to the previous one. So, the example would go something like this: Fprob.lmer.full x Fprob.lmer.add to test the interaction term Fprob.lmer.add x Fprob.lmer.parr (just a guess, judging for the AIC value) to test for the habitat effect Fprob.lmer.parr x Fprob.lmer.null to test for the parr effect Only proceeding to the next step if the LRT is nonsignificant. If changes to the model are significant, individual terms which haven't been tested yet would be tested by removing them and comparing to that model. In this case, only the habitat effect could have a different p value, but I can imagine when you are comparing models with more terms that results could differ. I've even seen cases in which terms are sequentially removed as long as they are nonsignificant, point in which single terms previously removed are "re-added" individually to see if there is improvement to the model. This may even be kind of "off topic" for this list, but since considerable discussion has been going on here about hypothesis testing and LRT previously, and the question was originally asked here, I thought it wouldn't hurt to continue the discussion...
You should also compare the results to those from simple models ignoring the fact that females may appear more than once. They should be broadly similar. How large is the variance due to the random effects anyway?
There was some discussion here previously about how it wouldn't be good to compare likelihoods of GLMM and GLM, because they are generated differently or something like that... Could this be done by generating a dummy identity variable with no repetitions (simulating non-repeating individual identities), using it as a random variable, and comparing the models? I don't know what this could do to the estimation of parameters and so on... Many thanks in advance, and sorry for anything! Abra?os, Rafael Maia --- "A little learning is a dangerous thing; drink deep, or taste not the Pierian spring." (A. Pope) Animal Behavior Lab Dept. of Zoology, Universidade de Brasilia http://www.unb.br/ib/zoo/rhmacedo/
On 16 Dec 2008, at 20:03, David Duffy wrote:
On Tue, 16 Dec 2008, David Grimardias wrote:
So for each type of behaviour, I want to know if fixed effects (habitat, parr and interaction) are significant or not. As previously requested, and answered by Mister Bates (https://stat.ethz.ch/pipermail/r-sig-mixed-models/2008q4/001458.html ), I tried to use Likelihood ratio tests to determine if these two factors are significant or not. Here are an example about one type of behaviour (Fprob = number of probings by female) :
Fprob.lmer.full<-lmer(Fprob ~habitat+parr+habitat:parr +(1| female),family=poisson) Fprob.lmer.add<-lmer(Fprob ~habitat+ parr +(1| female),family=poisson) Fprob.lmer.hab<-lmer(Fprob ~habitat+(1|female),family=poisson) Fprob.lmer.parr<-lmer(Fprob ~ parr +(1|female),family=poisson) Fprob.lmer.null<-lmer(Fprob ~1+(1|female),family=poisson)
So, I considered as LR test for each effect : Habitat : Chi2 = 0.9866; df = 1; p = 0.3206 Parr : Chi2 = 2.0372; df = 1; p = 0.1535 Interaction Habitat :parr : Chi2 = 0.545; df = 1; p = 0.4604 I first would like to know if I am wrong, or if I correctly analysed my data?
Yes, though there has been some discussion on this list about the adequacy of the chi-square approximation for the LRTS in these models. You should also compare the results to those from simple models ignoring the fact that females may appear more than once. They should be broadly similar. How large is the variance due to the random effects anyway?
Second, I guess that LMER function is optimizing REML by default (if I correctly read the help file), but I had understood that we need to optimize ML to compare fixed effects (from Pinheiro J C & Bates D M, "Mixed-effects models in S and S-PLUS"). If right, what do I need to change to correctly analyzed my data with Likelihood ratio tests ?
REML is not done for GLMMs IIRC. David Duffy. -- | David Duffy (MBBS PhD) ,-_|\ | email: davidD at qimr.edu.au ph: INT+61+7+3362-0217 fax: -0101 / * | Epidemiology Unit, Queensland Institute of Medical Research \_,-._/ | 300 Herston Rd, Brisbane, Queensland 4029, Australia GPG 4D0B994A v
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
On Tue, 16 Dec 2008, Rafael Maia wrote:
Hello, I am quite new to all this approach, but since it's somewhat different from what I've seen in classes and I have to rely on what I have been learning by myself and on lists such as this one, it's easy to get confused. Anyway, I have seen in several textbooks which take this anova table / LRT approach the opposite "direction" of effect testing: starting with the full model, removing terms (interactions first, then according to effect size, for example) and comparing to the previous one. This may even be kind of "off topic" for this list, but since considerable discussion has been going on here about hypothesis testing and LRT previously, and the question was originally asked here, I thought it wouldn't hurt to continue the discussion...
It's a generic model building question. If you have large numbers of variables, fitting the all K-way interactions model as a base model can be expensive, so people try to build upwards from simpler models. I quite like the idea of Bayesian Model Averaging...
There was some discussion here previously about how it wouldn't be good to compare likelihoods of GLMM and GLM, because they are generated differently or something like that...
That was just whether particular constants are included or excluded in the likelihood expressions used in the different codes. My comment was to the effect that the parameter estimates and LRTs from simple fixed effects models are a reality check for the results coming from the GLMMs, especially if you are worried that something isn't working properly. David Duffy.
| David Duffy (MBBS PhD) ,-_|\ | email: davidD at qimr.edu.au ph: INT+61+7+3362-0217 fax: -0101 / * | Epidemiology Unit, Queensland Institute of Medical Research \_,-._/ | 300 Herston Rd, Brisbane, Queensland 4029, Australia GPG 4D0B994A v