multiple nested random factors
Amanda Adams <aadams26 at ...> writes:
I have been having a heck of a time figuring out how to estimate the proportion of variance from several random factors. I have count data on the number of bat calls recorded at 3 sites, on 6 detectors, over 12 nights. Detectors were at 2 heights. If I understand nested factors correctly, Detectors are nested in Site and Night is nested in Site. Site/Detector and Site/Night are random factors and Height is a fixed factor.
It's still not entirely clear to me from this description how your data are structured. You have an average of about 249/12 ~ 21 observations per night, so I'm going to assume you have 6 detectors *at each site*. Detector will be nested in site, because it doesn't make any sense to analyze what happens at "detector number 1" unless the detectors are somehow arranged so that the set (d1:site1, d1:site2, d1:site3, ...) has something in common.
You *may* want a night:site interaction (if you have enough data), but in principle you also want a site factor (probably fixed, since there are only three levels) and a night factor. This would be
    ~ f.Height + f.Site + (1|f.Night/f.Site) + (1|f.Site:f.Detector)
It is quite likely that you will find some of these variance components estimated as zero ...
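For what it's worth, here is a minimal sketch of how that formula could be fitted with glmer() from lme4 on raw counts. The data are simulated, not yours: the balanced 3 x 6 x 12 layout, the assignment of heights to detectors, and all effect sizes are assumptions made up purely for illustration.

```r
library(lme4)  # assumes the lme4 package is installed

set.seed(101)
## Hypothetical balanced design: 3 sites x 6 detectors per site x 12 nights
dd <- expand.grid(f.Night = factor(1:12),
                  f.Detector = factor(1:6),
                  f.Site = factor(1:3))
## Assumption: odd-numbered detectors are at height 1, even-numbered at height 2
dd$f.Height <- factor(ifelse(as.integer(dd$f.Detector) %% 2 == 1, 1, 2))

## Made-up random effects on the log scale
night_id <- as.integer(dd$f.Night)                             # 12 nights
ns_id    <- as.integer(interaction(dd$f.Night, dd$f.Site))     # 36 night-by-site combinations
det_id   <- as.integer(interaction(dd$f.Site, dd$f.Detector))  # 18 detectors
eta <- 3 + 0.5 * (dd$f.Height == "2") +
  rnorm(12, sd = 0.5)[night_id] +
  rnorm(36, sd = 0.3)[ns_id] +
  rnorm(18, sd = 0.4)[det_id]
dd$Calls <- rpois(nrow(dd), lambda = exp(eta))  # raw integer counts

fit <- glmer(Calls ~ f.Height + f.Site +
               (1 | f.Night/f.Site) + (1 | f.Site:f.Detector),
             data = dd, family = poisson)
print(VarCorr(fit))  # one variance component per grouping term
```

With real data that are unbalanced or sparse, the fit may be singular or throw convergence warnings, which is one way the "estimated as zero" variance components show up.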
Also, data is overdispersed so I am transforming number of calls as log(Calls+1).
This makes no sense (sorry). Poisson models must have a response variable that is a raw count value (integer); log(Calls+1) is not a count, and rounding it to an integer (as in your glmer call) does not make it one. How do you know the data are overdispersed before you fit a model? (Although I do see that you have widely varying values in your 'Calls' variable, so you may be right ...) For various ways of handling overdispersion in GLMMs see http://glmm.wikidot.com/faq
I don't know if it's helpful, but Bolker et al. 2009 _Trends in Ecology and Evolution_ might be a citeable source for GLMMs. It doesn't really say anything specific about Poisson variables and why a Poisson model doesn't include a residual variance; for that you should probably cite (after reading!) a basic book on generalized linear models.
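One common rough check, sketched here under assumptions, is to fit the Poisson model to the raw counts first and then compare the sum of squared Pearson residuals to the residual degrees of freedom; a ratio well above 1 suggests overdispersion. The example uses a plain glm() on simulated negative binomial counts (made-up mean and size, hypothetical variable names) to keep it self-contained; residuals(..., type = "pearson") and df.residual() can be applied to a glmer fit in the same way, although the degrees of freedom are only approximate there.

```r
set.seed(42)
n <- 250
x <- factor(rep(1:2, length.out = n))   # stand-in for a two-level height factor
## Simulated overdispersed counts: negative binomial, variance = mu + mu^2/size
y <- rnbinom(n, mu = 50, size = 1)

## Fit the Poisson model to the raw counts, then check the residuals
fit_pois <- glm(y ~ x, family = poisson)
od_ratio <- sum(residuals(fit_pois, type = "pearson")^2) / df.residual(fit_pois)
od_ratio   # much larger than 1 here, because the counts are not Poisson
```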
'data.frame': 249 obs. of 11 variables:
 $ Night     : int 1 3 5 11 12 1 3 5 11 12 ...
 $ Night2    : int 1 2 3 4 5 1 2 3 4 5 ...
 $ Site      : int 1 1 1 1 1 1 1 1 1 1 ...
 $ Species   : int 1 1 1 1 1 1 1 1 1 1 ...
 $ Detector  : int 1 1 1 1 1 2 2 2 2 2 ...
 $ Height    : int 1 1 1 1 1 2 2 2 2 2 ...
 $ Calls     : int 6 444 236 12 143 5 815 712 30 142 ...
 $ f.Night   : Factor w/ 12 levels "1","2","3","4",..: 1 2 3 4 5 1 2 3 4 5 ...
 $ f.Site    : Factor w/ 4 levels "1","2","3","4": 1 1 1 1 1 1 1 1 1 1 ...
 $ f.Detector: Factor w/ 6 levels "1","2","3","4",..: 1 1 1 1 1 2 2 2 2 2 ...
 $ f.Height  : Factor w/ 2 levels "1","2": 1 1 1 1 1 2 2 2 2 2 ...
By the way, you said you have three sites, but the data have four levels for f.Site? Did you drop one site from the data and not use droplevels() ?
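To illustrate what I mean, here is a small base-R sketch with a made-up factor (not your data): subsetting keeps the unused level around until droplevels() is called.

```r
## Hypothetical site factor with four levels
site <- factor(c(1, 1, 2, 3, 4))

## Dropping the observations from site 4 does not drop the level itself
sub <- site[site != "4"]
nlevels(sub)               # still counts the empty level "4"

## droplevels() removes levels that no longer occur in the data
sub_clean <- droplevels(sub)
nlevels(sub_clean)
levels(sub_clean)
```

The same call works on a whole data frame, e.g. droplevels(data), and cleans every factor column at once.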
I then coded for the nested variables:
data$detector <- with(data, factor(f.Site:f.Detector))
data$night <- with(data, factor(f.Site:f.Night))
trans.log <- log(data$Calls+1)
model <- glmer(round(trans.log, digits=0) ~ f.Height + (1|night) +
               (1|detector) + (1|f.Site),
               data = data, family = poisson)
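As a small illustration of what the factor(f.Site:f.Detector) coding above does, here is a base-R sketch on a made-up toy data frame (the values are hypothetical): the `:` operator builds every site-by-detector combination, and wrapping it in factor() keeps only the combinations that actually occur.

```r
## Hypothetical toy data: site 1 has detectors 1 and 2, site 2 only has detector 1
toy <- data.frame(f.Site     = factor(c(1, 1, 2)),
                  f.Detector = factor(c(1, 2, 1)))

## `:` on two factors gives their interaction, with all level combinations
crossed <- with(toy, f.Site:f.Detector)
levels(crossed)    # includes the unobserved combination "2:2"

## factor() drops the unused combinations, leaving one level per real detector
nested <- factor(crossed)
levels(nested)
```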
I am uncertain on a couple things. Are my nested variables correct? Can
I correct for overdispersion with a transformation?
I was also wondering if there is a reference explaining why there is no
residual variance term for the Poisson distribution. I saw the
explanation on a forum, but was hoping there was something I could cite.
Any help or advice would be appreciated.
Thank you!
Amanda