zero variance query
Christine Griffiths wrote:
Dear Emmanuel and Ben Many thanks for your advice. Unfortunately, I don't think that I can offset with log(area), given that each area is the same.
Why not? All the offset does is add a constant (i.e., fixed rather than estimated -- could be the same or different for different observations) to the regression model.
My rationale for converting to m2 was to standardise abundances to 1 m2 as I have other parameters which were measured to different areas.
Don't quite understand this. Parameters from other studies that you want to compare in discussion? If so, you can just rescale your predictions/parameters *after* you estimate them ... I had previously
attempted to normalise my data by logging but felt that it did not improve the distribution. I just hadn't tried it in my modelling. Logging my count data dramatically improved the fit of the model (AIC 116.7 v 312.5), however the variance still remains low. Does this appear acceptable? Furthermore, can I assess model fit of different transformations of the same dataset using AIC values, i.e. compare log(Count) and inverse transformed Count?
No, not without a correction. See http://www.unc.edu/courses/2007spring/enst/562/001/docs/lectures/lecture22.htm Generalized linear modeling is not as flexible (in some ways) as classical linear models -- you can't just transform the data any way you want (in principle I suppose you could, but it's basically not possible to "transform to achieve a Poisson distribution" the way you would transform continuous data to achieve normality etc.)
lncount<-log(Count+1)
m1<-m1<-lmer(lncount~Treatment+(1|Month)+(1|Block),family=quasipoisson)
summary(m1)
Generalized linear mixed model fit by the Laplace approximation
Formula: lncount ~ Treatment + (1 | Month) + (1 | Block)
AIC BIC logLik deviance
116.7 135.1 -52.33 104.7
Random effects:
Groups Name Variance Std.Dev.
Month (Intercept) 1.8937e-14 1.3761e-07
Block (Intercept) 3.5018e-02 1.8713e-01
Residual 3.9318e-01 6.2704e-01
Number of obs: 160, groups: Month, 10; Block, 6
Fixed effects:
Estimate Std. Error t value
(Intercept) -0.4004 0.1239 -3.232
Treatment2.Radiata 0.4596 0.1305 3.522
Treatment3.Aldabra 0.4295 0.1334 3.220
Correlation of Fixed Effects:
(Intr) Trt2.R
Trtmnt2.Rdt -0.581
Trtmnt3.Ald -0.577 0.530
I used quasipoisson as my data is overdispersed. It was further improved by
an inverse transformation (AIC 43.54). Again I have small variances.
invcount<-1/(Count+1)
m3<-lmer(invcount~Treatment+(1|Month)+(1|Block),family=quasipoisson)
summary(m3)
Generalized linear mixed model fit by the Laplace approximation
Formula: invcount ~ Treatment + (1 | Month) + (1 | Block)
AIC BIC logLik deviance
43.54 62 -15.77 31.54
Random effects:
Groups Name Variance Std.Dev.
Month (Intercept) 0.0000000 0.000000
Block (Intercept) 0.0021038 0.045867
Residual 0.0926225 0.304339
Number of obs: 160, groups: Month, 10; Block, 6
Fixed effects:
Estimate Std. Error t value
(Intercept) -0.51644 0.05411 -9.545
Treatment2.Radiata -0.36246 0.08401 -4.314
Treatment3.Aldabra -0.29319 0.08197 -3.577
Correlation of Fixed Effects:
(Intr) Trt2.R
Trtmnt2.Rdt -0.566
Trtmnt3.Ald -0.580 0.372
Log(Abundance) did not solve the problem of zero variance. If quasipoisson
errors are not acceptable to use with abundance, i.e. non-integers, is
there a family of errors that would be recommended? Or should I simply
multiply abundance to obtain whole numbers?
Many thanks in advance,
Christine
--On 01 June 2009 23:17 -0400 Ben Bolker <bolker at ufl.edu> wrote:
Emmanuel Charpentier wrote:
Le lundi 01 juin 2009 ? 18:00 +0100, Christine Griffiths a ?crit :
Dear R users, I am having a problem with getting zero variance in my lmer models which specify two random effects. Having scoured the help lists, I have read that this could be because my variables are strongly correlated. However, when I simplify my model I still encounter the same problem. My response variable is abundance which ranges from 0-0.14. Below is an example of my model:
m1<-lmer(Abundance~Treatment+(1|Month)+(1|Block),family=quasipoisson) summary(m1)
Generalized linear mixed model fit by the Laplace approximation
Formula: Abundance ~ Treatment + (1 | Month) + (1 | Block)
AIC BIC logLik deviance
17.55 36.00 -2.777 5.554
Random effects:
Groups Name Variance Std.Dev.
Month (Intercept) 5.1704e-17 7.1906e-09
Block (Intercept) 0.0000e+00 0.0000e+00
Residual 1.0695e-03 3.2704e-02
Number of obs: 160, groups: Month, 10; Block, 6
Fixed effects:
Estimate Std. Error t value
(Intercept) -3.73144 0.02728 -136.80
Treatment2.Radiata 0.58779 0.03521 16.69
Treatment3.Aldabra 0.47269 0.03606 13.11
Correlation of Fixed Effects:
(Intr) Trt2.R
Trtmnt2.Rdt -0.775
Trtmnt3.Ald -0.756 0.586
1. Is it wrong to treat this as count data?
Hmmm... IST vaguely R that, when the world was young and I was (already) silly, Poisson distribution used to be a *discrete* distribution. Of course, this may or may not stand for "quasi"Poisson (for some value of "quasi"). May I inquire if you tried to analyze log(Abundance) (or log(Count), maybe including log(area) in the model) ? HTH, Emmanuel Charpentier
2. I would like to retain these as random factors given that I designed my experiment as a randomised block design and repeated measures, albeit non-orthogonal and unbalanced. Is it acceptable to retain these random factors, is all else is correct?
I think so ...
3. The above response variable was calculated per m2 by dividing the Count by the sample area. When I used the Count (range 0-9) as my response variable, I get a small but reasonable variation of random effects. Could anyone explain why this occurs and whether one response variable is better than another?
To agree with what Emmanuel said above: you should use Count~..., offset=log(area) for the correct analysis ... that should solve both your technical (zero random effects) and conceptual (even quasiPoisson should be discrete data) issues.
m2<-lmer(Count~Treatment+(1|Month)+(1|Block),family=quasipoisson) summary(m2)
Generalized linear mixed model fit by the Laplace approximation
Formula: Count ~ Treatment + (1 | Month) + (1 | Block)
AIC BIC logLik deviance
312.5 331 -150.3 300.5
Random effects:
Groups Name Variance Std.Dev.
Month (Intercept) 0.14591 0.38198
Block (Intercept) 0.58690 0.76609
Residual 2.79816 1.67277
Number of obs: 160, groups: Month, 10; Block, 6
Fixed effects:
Estimate Std. Error t value
(Intercept) 0.3098 0.3799 0.8155
Treatment2.Radiata 0.5879 0.2299 2.5575
Treatment3.Aldabra 0.5745 0.2382 2.4117
Correlation of Fixed Effects:
(Intr) Trt2.R
Trtmnt2.Rdt -0.347
Trtmnt3.Ald -0.348 0.536
Many thanks,
Christine
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
-- Ben Bolker Associate professor, Biology Dep't, Univ. of Florida bolker at ufl.edu / www.zoology.ufl.edu/bolker GPG key: www.zoology.ufl.edu/bolker/benbolker-publickey.asc
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
---------------------- Christine Griffiths School of Biological Sciences University of Bristol Woodland Road Bristol BS8 1UG Tel: 0117 9287593 Fax 0117 925 7374 Christine.Griffiths at bristol.ac.uk http://www.bio.bris.ac.uk/research/mammal/tortoises.html
Ben Bolker Associate professor, Biology Dep't, Univ. of Florida bolker at ufl.edu / www.zoology.ufl.edu/bolker GPG key: www.zoology.ufl.edu/bolker/benbolker-publickey.asc