fixed or random effects?
4 messages · Joana Martelo, Douglas Bates, Ben Bolker
On Mon, Oct 1, 2012 at 1:10 PM, joana martelo <jmmartelo at fc.ul.pt> wrote:
I'm modeling fish activity data, using a Gaussian distribution for scores obtained from a Principal Component Analysis, and I have a small problem that is hopefully simple to resolve. My explanatory variables are group size, fish length, and temperature, and I sampled in spring in two consecutive years. My problem is that I'm not sure whether I should treat year as a random or a fixed effect. I wonder if you could help me.
For you the year factor will have only two levels and that is too few to model the effect of year as a random effect. When you incorporate a random-effects term in a model you end up estimating a variance component instead of trying to estimate coefficients in a linear model expression directly. Having only two levels of year will not allow for a precise estimate of a variance component. In fact, it will be a horribly imprecise estimate.
There are no hard and fast rules of how many levels are required to be able to estimate a variance component but fewer than 5 is too few and more than 10 is adequate. I have used as few as 6 levels but that was on nicely balanced data from a designed experiment. Observational data that is highly unbalanced requires more care.
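To make the point about imprecision concrete, here is a minimal simulation sketch in base R. It is not from the thread: the sample sizes, standard deviations, and the helper name sim_var_est are all assumptions chosen for illustration, and it uses the simple balanced-data ANOVA (method-of-moments) estimator of the between-level variance as a stand-in for what a mixed-model fit would report.

```r
# Sketch: sampling variability of a between-level variance estimate
# with 2 versus 10 grouping levels (balanced data, ANOVA estimator).
# All settings are illustrative assumptions, not values from the thread.
set.seed(1)

sim_var_est <- function(n_levels, n_per_level = 20,
                        sd_between = 1, sd_within = 1) {
  group <- factor(rep(seq_len(n_levels), each = n_per_level))
  # True level effects drawn from N(0, sd_between^2)
  effect <- rnorm(n_levels, sd = sd_between)
  y <- effect[as.integer(group)] + rnorm(length(group), sd = sd_within)
  # Mean squares from the one-way ANOVA table: c(between, within)
  ms <- anova(lm(y ~ group))[["Mean Sq"]]
  # Method-of-moments estimate; it can be negative, so truncate at zero
  max(0, (ms[1] - ms[2]) / n_per_level)
}

est_2  <- replicate(1000, sim_var_est(2))
est_10 <- replicate(1000, sim_var_est(10))

# The true between-level variance is 1 in both cases;
# compare how widely the estimates scatter around it.
round(quantile(est_2,  c(0.05, 0.5, 0.95)), 2)
round(quantile(est_10, c(0.05, 0.5, 0.95)), 2)
```

With two levels the estimate rests on a single degree of freedom between levels, so the printed quantiles should spread much more widely than with ten levels.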
joana martelo <jmmartelo at ...> writes:
Hello R list, I'm modeling fish activity data, using a Gaussian distribution for scores obtained from a Principal Component Analysis, and I have a small problem that is hopefully simple to resolve. My explanatory variables are group size, fish length, and temperature, and I sampled in spring in two consecutive years. My problem is that I'm not sure whether I should treat year as a random or a fixed effect. I wonder if you could help me.
[snip] As hinted in private e-mail, http://glmm.wikidot.com/faq tells you that while you may *philosophically* want to treat two years as a sample from a larger population of years, it is not computationally practical (nor will it get you much in terms of inferential power) to treat a factor with only two levels as random: you should make it fixed. This then has the added advantage that you have only fixed effects, and you can use lm() instead of getting into any of the complexities of mixed models.
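A minimal sketch of the fixed-effect approach suggested above, using lm() from base R. The data frame, its column names (pc1, group_size, fish_length, temperature, year), and the simulated values are all hypothetical; the original data were not posted to the list.

```r
# Hypothetical data: column names and values are invented for illustration.
set.seed(42)
n <- 60
fish <- data.frame(
  group_size  = sample(1:8, n, replace = TRUE),
  fish_length = rnorm(n, mean = 10, sd = 2),
  temperature = rnorm(n, mean = 18, sd = 1.5),
  year        = factor(rep(c("first", "second"), each = n / 2))
)
# Simulated response standing in for a PCA score
fish$pc1 <- 0.3 * fish$group_size - 0.1 * fish$fish_length +
  0.5 * (fish$year == "second") + rnorm(n)

# Year enters as an ordinary two-level factor, costing one coefficient
fit <- lm(pc1 ~ group_size + fish_length + temperature + year, data = fish)
summary(fit)
```

In this sketch the coefficient labelled yearsecond is the estimated difference between the two years after adjusting for the other covariates, which is the quantity a two-level factor can actually support.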
Douglas Bates <bates at ...> writes:
On Mon, Oct 1, 2012 at 1:10 PM, joana martelo <jmmartelo at ...> wrote:
[snip] For you the year factor will have only two levels and that is too few to model the effect of year as a random effect. When you incorporate a random-effects term in a model you end up estimating a variance component instead of trying to estimate coefficients in a linear model expression directly. Having only two levels of year will not allow for a precise estimate of a variance component. In fact, it will be a horribly imprecise estimate.
There are no hard and fast rules of how many levels are required to be able to estimate a variance component but fewer than 5 is too few and more than 10 is adequate. I have used as few as 6 levels but that was on nicely balanced data from a designed experiment. Observational data that is highly unbalanced requires more care.
jinx (snap): http://separatedbyacommonlanguage.blogspot.ca/2006/10/jinx-and-snap.html (making gmane happy with more stuff)