Dear R-list members,
I am running mixed models using the lme4 package. During model selection, terms
were eliminated from a maximal model (with a random intercept) to reach a
simpler model that retained only the significant main effects and
interactions, using the Akaike information criterion. My final model includes
three fixed factors plus a random intercept. I then performed a likelihood
ratio test (LRT) of the significance of the random term. However, because the
p-value from the chi-squared table is incorrect when testing on the boundary
of the parameter space, I followed Zuur et al. (2009) and obtained a corrected
p-value by dividing the table p-value by 2.
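For reference, the halved-p boundary correction described above can be
sketched as follows. The data set and variable names here are hypothetical,
and both models are fitted by maximum likelihood (REML = FALSE) so the two
log-likelihoods are on a comparable scale:

```r
library(lme4)

## Hypothetical data/variable names; fit both models by ML, not REML,
## so that the log-likelihoods can be compared
m0 <- lm(y ~ x, data = dat)                               # no random effect
m1 <- lmer(y ~ x + (1 | group), data = dat, REML = FALSE) # random intercept

lrt <- as.numeric(2 * (logLik(m1) - logLik(m0)))          # LRT statistic
## Under H0 the variance sits on the boundary of its parameter space, so the
## null distribution is a 50:50 mixture of chi-squared(0) and chi-squared(1);
## hence the naive chi-squared(1) p-value is halved
p <- pchisq(lrt, df = 1, lower.tail = FALSE) / 2
```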
Briefly, my best-fit model consists of three main effects: Year (2006,
2007), Hatching Order (1st, 2nd, 3rd) and Sibling Competence
(Present/Absent), plus NestID as a random intercept. The modelled outcome is
the level of plasma proteins (continuous).
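As a concrete illustration, the model just described could be fitted along
these lines (the data frame and column names are hypothetical, not taken from
the actual analysis):

```r
library(lme4)

## Hypothetical names: the real data set and columns are not shown in this thread
fit <- lmer(PlasmaProtein ~ Year + HatchOrder + SibComp + (1 | NestID),
            data = chicks)
summary(fit)  # fixed effects plus the NestID intercept variance
```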
I tested the random effect (NestID), which has variance 2.1795e-16 and Std.
Dev. 1.4763e-08 (see output). The LRT yields a p-value of 0.00031 (0.00015
after dividing it by 2 as suggested). This would mean that adding the random
effect NestID to the model is a significant improvement. However, the
random-effect variance is essentially zero, which would indicate otherwise.
Also supporting a non-significant random effect, I think, is that the
coefficients and standard errors are exactly the same for the models with and
without the RE, as seen in the outputs.
Q 1. In your opinion, should I trust this LRT with its small p-value and
leave the random effect in my model, or follow the parsimony principle and
eliminate it?
Q 2. Is it possible, under certain conditions, to have a random effect with
such low variance and still obtain an LRT p-value indicating that model fit
is improved by it?
Outputs for both models, with and without random effect, followed by LRT
output:
MIXED MODEL
Luciano La Sala wrote:
> I am running mixed models using the lme4 package. In model selection, terms
> were eliminated from a maximum model (with random intercept) to achieve a
> simpler model that retained only the significant main effects and
> interactions, using the Akaike information criterion. My final model
> includes three fixed factors plus random intercept. Then I perform a
> likelihood ratio test to test the significance of the random term.
In general testing significance of components in a model *after* model
selection is dubious ... I agree with most of what Zuur et al say, but
I am only comfortable with stepwise procedures as a relatively necessary
evil for eliminating non-significant interaction terms to simplify
interpretation of the remaining model.
> However, because the p-value from the table is incorrect when testing on
> the boundary, I followed Zuur et al (2009) to get the corrected p-value by
> dividing the p-value obtained by 2.
>
> Briefly, my best-fit model consists of three main effects: Year (2006,
> 2007), Hatching Order (1st, 2nd, 3rd) and Sibling Competence
> (Present/Absent), plus NestID as a random intercept. The modelled outcome
> is the level of plasma proteins (continuous).
>
> I test the random effect (NestID), which has variance 2.1795e-16 and Std.
> Dev. 1.4763e-08 (see output). The LRT yields a p-value of 0.00031 (0.00015
> after dividing it by 2 as suggested). This would mean that adding the
> random effect NestID to the model is a significant improvement. However,
> the random-effect variance is near zero, which would indicate otherwise.
>
> In support of the non-significant random effect, I think, the coefficients
> and standard errors are exactly the same for the models with and without
> the RE, as seen in the outputs.
Your main problem is that the log-likelihoods returned by lm and lmer
are **NOT COMPARABLE**. Sooner or later there should probably be a
warning to that effect somewhere in the documentation ...
You may be able to use the RLRsim package to solve your problem.
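A minimal sketch of that suggestion, assuming a model with a single random
intercept (the data frame and variable names are hypothetical):

```r
library(lme4)
library(RLRsim)

## Hypothetical model matching the one described earlier in the thread
m1 <- lmer(PlasmaProtein ~ Year + HatchOrder + SibComp + (1 | NestID),
           data = chicks)
## exactRLRT simulates the exact finite-sample null distribution of the
## (restricted) LRT statistic, which handles the boundary problem directly
exactRLRT(m1)
```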
> Q 1. In your opinion, should I trust this LRT with a small p-value and
> leave the random effect in my model, or follow the parsimony principle and
> eliminate it?
I would leave it in whether or not it is significant (and it's
probably not). Note that as expected all the fixed effect parameters
are estimated identically under lmer and lm ... the reason to drop it
would be to have the convenience of not dealing with mixed effects at all.
> Q 2. Is it possible, under certain conditions, to have a random effect
> with such low variance and still obtain an LRT p-value indicating that
> model fit is improved by it?
Unlikely at best, unless your response variable has a very small
magnitude (e.g., you are comparing differences in hummingbird weights
across different diet treatments, and measuring them in units of petagrams).
> Outputs for both models, with and without random effect, followed by LRT
> output:
> MIXED MODEL
Ben Bolker
Associate professor, Biology Dep't, Univ. of Florida
bolker at ufl.edu / people.biology.ufl.edu/bolker
GPG key: people.biology.ufl.edu/bolker/benbolker-publickey.asc
There has been a lot of previous discussion about whether we should remove
random effects based on an LRT. The argument is that you added the random
effect because of your study design, so whether or not it is significant it
should remain in the model. I am not sure there is any definite rule; maybe
it depends on your study and your personal viewpoint.
Dr Anna R. Renwick
Research Ecologist
British Trust for Ornithology,
The Nunnery,
Thetford,
Norfolk,
IP24 2PU,
UK
Tel: +44 (0)1842 750050; Fax: +44 (0)1842 750030
Could someone please give some literature references for this
discussion? I am very interested.
In my experience in ecology, it often happens that one needs to
understand whether a variable or factor is
important/relevant/significant in determining some other variable.
I would like to understand to what extent the arbitrariness of the
experimenter is considered tolerable.
Thanks
Stefano
======================================================================
Stefano Leonardi
Dipartimento di Scienze Ambientali
Universita` di Parma E-mail:stefano.leonardi(at)unipr.it
Viale Usberti 11a Phone : +39-0521-905659
43100 PARMA (Italy) Fax : +39-0521-905402
There is quite a bit on the message board. For example, this thread:
https://stat.ethz.ch/pipermail/r-sig-mixed-models/2008q2/000743.html
Dr Anna R. Renwick
Research Ecologist
British Trust for Ornithology