Dear Dr. Ben Bolker,

My name is Carlos Barboza and I am a marine biologist from the Rio de Janeiro University, Brazil. First, it's a pleasure to again have the opportunity to send you a message. The reason for it is a simple question: can I compare AIC from:

1. glmmADMB: Density ~ 1 + 1|Site
2. glmmADMB: Density ~ Sector + 1|Site + Cage

Note that they have different random and fixed structures. I know that this is not the best approach to model selection, but I think that the AIC values can be compared.

Thank you very much for your attention.

Ben Bolker replied:

Is Cage a random effect? Are you intentionally leaving out the intercept in the second case (it will be included anyway unless you use -1)? In any case, I don't see any obvious reason you can't compare AIC values; see https://rawgit.com/bbolker/mixedmodels-misc/master/glmmFAQ.html#can-i-use-aic-for-mixed-models-how-do-i-count-the-number-of-degrees-of-freedom-for-a-random-effect

Follow-ups to r-sig-mixed-models at r-project.org, please ...

Carlos Barboza replied:

Sorry, yes, Cage was included only to exemplify a different random structure in the second case; it should be coded (1|Site) + (1|Cage). Yes, I know that the intercept will be included in the second model. It's an example of comparing AIC values from mixed models with different fixed and random structures:

1. Density ~ 1 + 1|Site
2. Density ~ Sector + 1|Site + 1|Cage

I believe that both AIC values can be compared. Again, thank you very much for your very fast message.
Comparing mixed models
10 messages · Ben Bolker, Carlos Barboza, Jean-Philippe Laurenceau +4 more
In reply to Carlos Barboza <carlosambarboza at gmail.com> (Sat, May 7, 2016 at 11:26 AM):

My only other comment would be that my standard approach would be to retain all random effects in the model unless they are causing difficulty in model fitting -- this depends on your goal (confirmation/testing, prediction, exploration).
In reply to Ben Bolker <bbolker at gmail.com> (2016-05-07 12:34 GMT-03:00):

Yes, I agree; my question was just a numerical one about comparing AIC values. My approach was to show that the model including the single fixed effect has a smaller AIC value than any model that includes only the intercept in the fixed structure.

Thank you.
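For reference, the quantity being compared in this exchange can be written out as below. This is the standard definition of AIC; the symbols are introduced here for illustration and do not appear in the thread. \ell_i is the maximised log-likelihood of model i and k_i is its number of estimated parameters (fixed-effect coefficients plus variance parameters). How a random effect should be counted in k_i is the question addressed by the FAQ entry linked above.

```latex
\mathrm{AIC}_i = -2\,\ell_i + 2\,k_i, \qquad
\Delta\mathrm{AIC} = \mathrm{AIC}_2 - \mathrm{AIC}_1
  = -2\,(\ell_2 - \ell_1) + 2\,(k_2 - k_1).
```

The difference is only meaningful when both models are fitted to the same response data and their log-likelihoods are computed on the same basis.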
1 day later
In reply to Carlos Barboza <carlosambarboza at gmail.com> (Sat, May 7, 2016 at 11:42 AM):

The problem (depending on what you're trying to do) with comparing model 1 and model 2 is that, if you observe, say, a large change in the AIC, it's not clear what to attribute the change to. It could be driven either by the fixed effect of Sector or by the random intercept for Cage. Maybe it doesn't matter in your case.
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Alex Fine Ph. (336) 302-3251 web: http://internal.psychology.illinois.edu/~abfine/ <http://internal.psychology.illinois.edu/~abfine/AlexFineHome.html>
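One way to separate the two possible sources of an AIC change is to also fit the two intermediate models, so that each term is added while the other part of the model is held constant. The sketch below only illustrates the bookkeeping: the log-likelihoods and parameter counts are invented, the two intermediate models were not part of the original question, and in practice the values would come from the fitted models.

```python
def aic(loglik, n_params):
    """AIC = -2 * maximised log-likelihood + 2 * number of estimated parameters."""
    return -2.0 * loglik + 2.0 * n_params


# Hypothetical (log-likelihood, parameter count) pairs, keyed by
# (fixed structure, random structure). These numbers are invented for
# illustration and are NOT results from the thread.
fits = {
    ("1", "Site"): (-120.0, 3),            # model 1 in the thread
    ("Sector", "Site"): (-112.0, 4),       # intermediate: adds Sector only
    ("1", "Site+Cage"): (-119.5, 4),       # intermediate: adds Cage only
    ("Sector", "Site+Cage"): (-111.0, 5),  # model 2 in the thread
}

aics = {key: aic(ll, k) for key, (ll, k) in fits.items()}

# Change in AIC from adding Sector, holding the random structure constant.
sector_effect = {
    rand: aics[("Sector", rand)] - aics[("1", rand)]
    for rand in ("Site", "Site+Cage")
}

# Change in AIC from adding Cage, holding the fixed structure constant.
cage_effect = {
    fixed: aics[(fixed, "Site+Cage")] - aics[(fixed, "Site")]
    for fixed in ("1", "Sector")
}

for key, value in sorted(aics.items()):
    print(key, value)
print("adding Sector:", sector_effect)
print("adding Cage:", cage_effect)
```

With these made-up numbers, nearly all of the difference between model 1 and model 2 comes from Sector, which is the kind of attribution the direct two-model comparison cannot provide.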
1 day later
In reply to Ben Bolker (Saturday, May 7, 2016 11:35 AM; subject: Re: [R-sig-ME] Comparing mixed models):

Dear Ben et al.,

I agree with the general practice of trying to estimate and retain as many random effects as possible (without estimation issues) in a mixed model. However, I was wondering whether anyone had some references recommending or arguing for this approach. I am aware of a paper on this topic with some simulation work by Barr et al. (2013; Journal of Memory and Language), but I would be interested in whether there are others.

Thanks,
J-P

Jean-Philippe Laurenceau, Ph.D.
Department of Psychological & Brain Sciences
University of Delaware
In reply to Jean-Philippe Laurenceau <jlaurenceau at psych.udel.edu> (Tue, May 10, 2016 at 10:52 PM):

There's a newer one out by Bates et al. that is sort of a response to Barr et al.: http://arxiv.org/abs/1506.04967

Alex Fine Ph. (336) 302-3251 web: http://internal.psychology.illinois.edu/~abfine/ <http://internal.psychology.illinois.edu/~abfine/AlexFineHome.html>
In reply to Jean-Philippe Laurenceau <jlaurenceau at psych.udel.edu> (Wed, 11 May 2016 05:52:24 +0300):

Dear Jean-Philippe,

There are some papers that deal with the special case in which the variance of an experimental design random term becomes negative due to a negative intraclass correlation. In old ANOVA models this could be detected as a negative variance (this term will earn head shaking...), whereas in mixed models, where the design term is modeled at the random level, this is often not detectable because the design term variance may just be fixed at zero / converge to zero (if constrained to be positive).

As a consequence, people tend to remove design terms from their models (because a zero-variance random term clearly does not improve the model) and make inferences about, let's say, treatments based on observational rather than experimental units (which would only be represented by including the experimental design term). This can lead to unrepeatable and overconfident inferences.

This problem cannot always be accounted for simply by leaving the random design term with a zero variance in the model. For example, asreml-R does not account for zero-variance terms in F-tests (the denominator degrees of freedom inflate to observational-level numbers); I am not sure what happens in lme4 / nlme models.

Here are some references about this very special topic. They cover only the issue of zero-variance design terms that may in fact be negative, and how the experimental design can be accounted for at the residual level (with the associated consequences for prediction ability) as an alternative to having zero-variance random terms:

Nelder, J. A. 1954. The interpretation of negative components of variance. Biometrika 41:544-548.

Wang, C. S., B. S. Yandell, and J. J. Rutledge. 1992. The dilemma of negative analysis of variance estimators of intraclass correlation. Theoretical and Applied Genetics 85:79-88.

Pryseley, A., C. Tchonlafi, G. Verbeke, and G. Molenberghs. 2011. Estimating negative variance components from Gaussian and non-Gaussian data: A mixed models approach. Computational Statistics & Data Analysis 55:1071-1085.

I hope this is not too special a case for your question, but I think it is a very important case for making inferences that account for an experimental design, i.e., a case in which a non-significant random term should be left in the model.

Best,
Paul
Paul V. Debes DFG Research Fellow Division of Genetics and Physiology Department of Biology University of Turku PharmaCity, 7th floor Itainen Pitkakatu 4 20014 Finland Email: paul.debes at utu.fi
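The way a classical ANOVA can report a negative variance component, as described above, can be seen in a small worked example. The sketch below uses the textbook method-of-moments estimator for a balanced one-way layout, (MSB - MSW) / n. The function name and the toy data are made up for illustration and are not taken from any of the cited papers.

```python
def anova_variance_components(groups):
    """Method-of-moments estimates for a balanced one-way random-effects layout.

    Returns (between_group_variance, within_group_variance). The between-group
    estimate is (MSB - MSW) / n and is NOT truncated at zero, so it can be
    negative when units within a group differ more than the group means do.
    """
    g = len(groups)
    n = len(groups[0])
    if any(len(values) != n for values in groups):
        raise ValueError("this sketch assumes a balanced design")

    grand_mean = sum(sum(values) for values in groups) / (g * n)
    group_means = [sum(values) / n for values in groups]

    # Mean square between groups and mean square within groups.
    msb = n * sum((m - grand_mean) ** 2 for m in group_means) / (g - 1)
    msw = sum(
        (y - m) ** 2 for values, m in zip(groups, group_means) for y in values
    ) / (g * (n - 1))

    return (msb - msw) / n, msw


# Toy data (hypothetical): three tanks with two observations each. Values
# within a tank are far apart, while the tank means are nearly equal.
tanks = [[1.0, 9.0], [3.0, 9.0], [2.0, 6.0]]
sigma2_between, sigma2_within = anova_variance_components(tanks)
print(sigma2_between, sigma2_within)
```

Here the between-tank estimate is negative. A fit that constrains the variance to be non-negative would instead report zero for the same data, which is the situation in which the design term tends to get dropped.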
In reply to Paul Debes <paul.debes at utu.fi> (11/05/2016, at 17:39):

I have argued for allowing negative random effect estimates to be output, as was, and I expect still is, the case for Genstat mixed model fits. What does asreml-R do? The negative value is needed so that the variance-covariance matrix, which does have to be positive definite (or at least semi-definite), is correctly estimated. The negative value, if more negative than can be ascribed to chance, is a useful warning device.

Someone at Rothamsted told me about getting data where blocks had been chosen in which treatment plots moved successively further away from the stream. The additional systematic within-block variance thereby induced called for a negative between-blocks random effect so that the variance-covariance matrix would come out "right". Maybe Nelder's paper mentions this specific type of effect?

John Maindonald email: john.maindonald at anu.edu.au
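The point that a negative block-level component can still give a valid variance-covariance matrix can be made explicit in the simplest case. As a sketch, assume a single block of n units with one block-level component \sigma_b^2 and residual variance \sigma_e^2 (compound symmetry); these symbols are illustrative and are not used in the thread.

```latex
V = \sigma_e^2 I_n + \sigma_b^2 J_n, \qquad
\text{eigenvalues of } V:\;
\sigma_e^2 \ (\text{multiplicity } n-1), \quad
\sigma_e^2 + n\,\sigma_b^2 \ (\text{multiplicity } 1).

V \text{ is positive definite} \iff \sigma_e^2 > 0 \ \text{and}\ \sigma_b^2 > -\frac{\sigma_e^2}{n}
\iff -\frac{1}{n-1} < \rho < 1,
\qquad \rho = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_e^2}.
```

Here I_n is the identity matrix and J_n the n-by-n matrix of ones. Under these assumptions \sigma_b^2 may be moderately negative, corresponding to a negative intraclass correlation \rho, while V remains a proper covariance matrix.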
ASReml-R does allow for negative variances, but you have to explicitly specify it via the component constraints. I also think this may be advisable to do for testing what is going on, especially when an important design term variance converged to zero. The variance may either simply be very small, which may just ask for a response / covariate rescaling or changing the threshold when the software considers a component to be zero, or be really negative. Otherwise, for 'boundary' variance terms ASReml-R appears to estimate the random effects (you can still extract them from the model) but it does not estimate the variance among them. My guess is that designs described by Nelder occur more often than thought because I still see mention of 'pooling variance' of design terms (or 'stepwise reducing models for non-significant terms'), so it remains unknown what was really going on with these removed design terms. I worked with different fish populations, kept due to space limitations in the same tanks; tanks were the experimental treatment units (split plot design of fish type within treatment tank). Now the fish populations had very different growth for families across treatments (wild vs. aquaculture - what a surprise), leading to a negative variance among tank effects, like what Nelder described. I think this block design in the stream you describe may have exhibited a similar pattern (I think I already read about it in an older post). Back then, I really struggled how to deal with this practically, without running into controversies (I'm a biologist - impossible to be further away from being a statistician), until Geert Molenbeek helped me with bringing up (covered, if I remember correctly, also by some of his publications) that it may be easiest to interpret a negative variance if specified as correlation at the residual level. I did this and was able to include tank effects that did not converge to zero (as I accounted for the negative correlation elsewhere). 
Thus, I could happily report the negative variance as negative correlation, include tank effects, and report F-test results with the correct denominator degrees of freedom, though the model was more complicated than I wished for. However, for more complicated experimental designs where a negative variance occurs at a level that cannot be moved to the residuals (or be specified directly as a covariance/correlation between other random effect groups, which may also have been a solution for my problem back then), one may have to deal with a negative variance component and risk being fried by reviewers. On Wed, 11 May 2016 09:49:41 +0300, John Maindonald
<john.maindonald at anu.edu.au> wrote:
I have argued for allowing negative random effect estimates to be output, as was and I expect still is the case for Genstat mixed model fits. What does asreml-R do? The negative value is needed so that the variance-covariance matrix, which does have to be positive definite (or at least semi-definite) is correctly estimated. The negative value, if more negative than can be ascribed to chance, is a useful warning device. Someone at Rothamsted told me about getting data where blocks had been chosen in which treatment plots moved successively further away from the stream. The additional systematic within block variance thereby induced called for a negative between blocks random effect so that the variance-covariance matrix would come out ?right?. Maybe Nelder?s paper mentions this specific type of effect? John Maindonald email: john.maindonald at anu.edu.au
On 11/05/2016, at 17:39, Paul Debes <paul.debes at utu.fi> wrote:

Dear Jean-Philippe,

There are some papers that deal with the special case in which the variance of an experimental design random term becomes negative due to a negative intraclass correlation. In old ANOVA models this could be detected as a negative variance (this term will earn head shaking...), whereas in mixed models, where the design term is modeled at the random level, it is often not detectable because the design term variance may just be fixed at zero / converge to zero (if constrained to be positive).

As a consequence, people tend to remove design terms from their models (because a zero-variance random term clearly does not improve the model) and make inferences about, let's say, treatments based on observational rather than experimental units (which would only be represented by including the experimental design term), and this can lead to unrepeatable and overconfident inferences. This problem cannot always be accounted for simply by leaving the random design term with a zero variance in the model. For example, asreml-R does not account for zero-variance terms in F-tests (the denominator degrees of freedom inflate to observational-level numbers); I am not sure what happens in lme4 / nlme models.

Here are some references on this very special topic. They cover only the issue of zero-variance design terms that may in fact be negative, and how the experimental design can be accounted for at the residual level (with the associated consequences for prediction ability) as an alternative to having zero-variance random terms:

Nelder, J. A. 1954. The interpretation of negative components of variance. Biometrika 41:544-548.

Wang, C. S., B. S. Yandell, and J. J. Rutledge. 1992. The dilemma of negative analysis of variance estimators of intraclass correlation. Theoretical and Applied Genetics 85:79-88.

Pryseley, A., C. Tchonlafi, G. Verbeke, and G. Molenberghs. 2011. Estimating negative variance components from Gaussian and non-Gaussian data: A mixed models approach. Computational Statistics & Data Analysis 55:1071-1085.

I hope that is not too special a case for your question, but I think it is a very important case for making inferences that account for an experimental design, i.e., one where a non-significant random term should be left in the model.

Best,
Paul

On Wed, 11 May 2016 05:52:24 +0300, Jean-Philippe Laurenceau <jlaurenceau at psych.udel.edu> wrote:
Dear Ben et al.--I agree with the general practice of trying to estimate and retain as many random effects as possible (without estimation issues) in a mixed model. However, I was wondering whether anyone had some references recommending or arguing for this approach. I am aware of a paper on this topic with some simulation work by Barr et al. (2013; Journal of Memory and Language), but I would be interested in whether there are others.

Thanks, J-P

Jean-Philippe Laurenceau, Ph.D.
Department of Psychological & Brain Sciences
University of Delaware
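The negative intraclass correlation case raised in Paul Debes's message can be illustrated with the classical one-way ANOVA (method-of-moments) estimator. This is a minimal sketch in plain Python with invented, balanced data. The function name is hypothetical, and clamping at zero is only a stand-in for what a fit constrained to non-negative variances would report, not the behaviour of any particular package.

```python
# Sketch: one-way ANOVA (method-of-moments) estimate of a between-block
# variance component for balanced data. The data are invented for illustration.

def anova_variance_components(blocks):
    """Return (between, within) variance estimates for equal-sized blocks."""
    k = len(blocks)            # number of blocks
    n = len(blocks[0])         # observations per block
    grand_mean = sum(sum(b) for b in blocks) / (k * n)
    block_means = [sum(b) / n for b in blocks]
    ms_between = n * sum((m - grand_mean) ** 2 for m in block_means) / (k - 1)
    ms_within = sum((y - m) ** 2
                    for b, m in zip(blocks, block_means)
                    for y in b) / (k * (n - 1))
    return (ms_between - ms_within) / n, ms_within

# Plots within a block differ systematically, so the block means coincide
# while the within-block spread is large.
blocks = [[1.0, -1.0], [-1.0, 1.0], [1.0, -1.0]]
between, within = anova_variance_components(blocks)
constrained = max(between, 0.0)      # stand-in for a non-negativity constraint
icc = between / (between + within)   # implied intraclass correlation
print(between, within, constrained, icc)
```

With these data the unconstrained estimate of the between-block component is negative, and the clamped value of zero hides that information, which is the situation in which a design term tends to get dropped.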
This is a fortunes candidate.

I'm a biologist - impossible to be further away from being a statistician.
-- Paul Debes

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey

2016-05-11 10:04 GMT+02:00 Paul Debes <paul.debes at utu.fi>:
ASReml-R does allow for negative variances, but you have to explicitly specify it via the component constraints. I also think this may be advisable for testing what is going on, especially when an important design term variance has converged to zero. The variance may either simply be very small, which may just call for rescaling the response / covariate or changing the threshold at which the software considers a component to be zero, or be really negative. Otherwise, for 'boundary' variance terms ASReml-R appears to estimate the random effects (you can still extract them from the model) but it does not estimate the variance among them.

My guess is that designs like those described by Nelder occur more often than thought, because I still see mention of 'pooling variance' of design terms (or 'stepwise reducing models for non-significant terms'), so it remains unknown what was really going on with these removed design terms.

I worked with different fish populations that, due to space limitations, were kept in the same tanks; tanks were the experimental treatment units (a split-plot design of fish type within treatment tank). The fish populations had very different growth for families across treatments (wild vs. aquaculture - what a surprise), leading to a negative variance among tank effects, like what Nelder described. I think the block design by the stream that you describe may have exhibited a similar pattern (I think I already read about it in an older post).

Back then, I really struggled with how to deal with this in practice without running into controversies (I'm a biologist - impossible to be further away from being a statistician), until Geert Molenberghs helped me by pointing out (covered, if I remember correctly, also by some of his publications) that it may be easiest to interpret a negative variance if it is specified as a correlation at the residual level. I did this and was able to include tank effects that did not converge to zero (as I accounted for the negative correlation elsewhere). Thus, I could happily report the negative variance as a negative correlation, include tank effects, and report F-test results with the correct denominator degrees of freedom, though the model was more complicated than I wished for.

However, for more complicated experimental designs where a negative variance occurs at a level that cannot be moved to the residuals (or be specified directly as a covariance/correlation between other random effect groups, which may also have been a solution for my problem back then), one may have to deal with a negative variance component and risk being fried by reviewers.
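The re-specification described in this message, reporting a negative among-tank variance as a within-tank residual correlation, rests on the two forms giving the same marginal covariance. Here is a minimal sketch in plain Python, assuming a single tank random intercept plus independent residuals. The function names and numbers are hypothetical, and this is not the ASReml-R model that was actually fitted.

```python
# Sketch: a (possibly negative) among-tank variance component rewritten as a
# within-tank residual correlation. Numbers are illustrative only.

def as_residual_correlation(sigma2_tank, sigma2_resid):
    """Map variance components to (total variance, within-tank correlation)."""
    total = sigma2_tank + sigma2_resid
    return total, sigma2_tank / total

def compound_symmetry(n, total, rho):
    """n x n residual covariance: total on the diagonal, rho * total elsewhere."""
    return [[total if i == j else rho * total for j in range(n)]
            for i in range(n)]

def variance_component_form(n, sigma2_tank, sigma2_resid):
    """The same covariance written as a tank effect plus independent residuals."""
    return [[sigma2_tank + (sigma2_resid if i == j else 0.0) for j in range(n)]
            for i in range(n)]

sigma2_tank, sigma2_resid = -0.5, 2.5
total, rho = as_residual_correlation(sigma2_tank, sigma2_resid)
cs = compound_symmetry(3, total, rho)
vc = variance_component_form(3, sigma2_tank, sigma2_resid)
# Element-wise comparison of the two parameterisations.
same = all(abs(cs[i][j] - vc[i][j]) < 1e-12
           for i in range(3) for j in range(3))
print(total, rho, same)
```

In this parameterisation the negative quantity appears as a correlation, which, unlike a variance, is allowed to be negative, so it is easier to report.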