zero variance query

8 messages · Christine Griffiths, Emmanuel Charpentier, Ben Bolker +1 more

Original

1

8

Christine Griffiths

Mon, Jun 1, 2009 10:00 AM #

Dear R users,

I am having a problem with getting zero variance in my lmer models which 
specify two random effects. Having scoured the help lists, I have read that 
this could be because my variables are strongly correlated. However, when I 
simplify my model I still encounter the same problem.

My response variable is abundance which ranges from 0-0.14.

Below is an example of my model:

Generalized linear mixed model fit by the Laplace approximation
Formula: Abundance ~ Treatment + (1 | Month) + (1 | Block)
   AIC   BIC logLik deviance
 17.55 36.00 -2.777    5.554
Random effects:
 Groups   Name        Variance   Std.Dev.
 Month    (Intercept) 5.1704e-17 7.1906e-09
 Block    (Intercept) 0.0000e+00 0.0000e+00
 Residual             1.0695e-03 3.2704e-02
Number of obs: 160, groups: Month, 10; Block, 6

Fixed effects:
                   Estimate Std. Error t value
(Intercept)        -3.73144    0.02728 -136.80
Treatment2.Radiata  0.58779    0.03521   16.69
Treatment3.Aldabra  0.47269    0.03606   13.11

Correlation of Fixed Effects:
            (Intr) Trt2.R
Trtmnt2.Rdt -0.775
Trtmnt3.Ald -0.756  0.586

1. Is it wrong to treat this as count data?
2. I would like to retain these as random factors given that I designed my 
experiment as a randomised block design and repeated measures, albeit 
non-orthogonal and unbalanced. Is it acceptable to retain these random 
factors, is all else is correct?
3. The above response variable was calculated per m2 by dividing the Count 
by the sample area. When I used the Count (range 0-9) as my response 
variable, I get a small but reasonable variation of random effects. Could 
anyone explain why this occurs and whether one response variable is better 
than another?

Generalized linear mixed model fit by the Laplace approximation
Formula: Count ~ Treatment + (1 | Month) + (1 | Block)
   AIC BIC logLik deviance
 312.5 331 -150.3    300.5
Random effects:
 Groups   Name        Variance Std.Dev.
 Month    (Intercept) 0.14591  0.38198
 Block    (Intercept) 0.58690  0.76609
 Residual             2.79816  1.67277
Number of obs: 160, groups: Month, 10; Block, 6

Fixed effects:
                   Estimate Std. Error t value
(Intercept)          0.3098     0.3799  0.8155
Treatment2.Radiata   0.5879     0.2299  2.5575
Treatment3.Aldabra   0.5745     0.2382  2.4117

Correlation of Fixed Effects:
            (Intr) Trt2.R
Trtmnt2.Rdt -0.347
Trtmnt3.Ald -0.348  0.536

Many thanks,
Christine

Emmanuel Charpentier

Mon, Jun 1, 2009 4:07 PM #

Le lundi 01 juin 2009 ? 18:00 +0100, Christine Griffiths a ?crit :

Hmmm... IST vaguely R that, when the world was young and I was (already)
silly, Poisson distribution used to be a *discrete* distribution. Of
course, this may or may not stand for "quasi"Poisson (for some value of
"quasi").

May I inquire if you tried to analyze log(Abundance) (or log(Count),
maybe including log(area) in the model) ?

HTH,

					Emmanuel Charpentier

Ben Bolker

Mon, Jun 1, 2009 8:17 PM #

Emmanuel Charpentier wrote:

Le lundi 01 juin 2009 ? 18:00 +0100, Christine Griffiths a ?crit :

Dear R users,

I am having a problem with getting zero variance in my lmer models which 
specify two random effects. Having scoured the help lists, I have read that 
this could be because my variables are strongly correlated. However, when I 
simplify my model I still encounter the same problem.

My response variable is abundance which ranges from 0-0.14.

Below is an example of my model:

m1<-lmer(Abundance~Treatment+(1|Month)+(1|Block),family=quasipoisson)
summary(m1)

Generalized linear mixed model fit by the Laplace approximation
Formula: Abundance ~ Treatment + (1 | Month) + (1 | Block)
   AIC   BIC logLik deviance
 17.55 36.00 -2.777    5.554
Random effects:
 Groups   Name        Variance   Std.Dev.
 Month    (Intercept) 5.1704e-17 7.1906e-09
 Block    (Intercept) 0.0000e+00 0.0000e+00
 Residual             1.0695e-03 3.2704e-02
Number of obs: 160, groups: Month, 10; Block, 6

Fixed effects:
                   Estimate Std. Error t value
(Intercept)        -3.73144    0.02728 -136.80
Treatment2.Radiata  0.58779    0.03521   16.69
Treatment3.Aldabra  0.47269    0.03606   13.11

Correlation of Fixed Effects:
            (Intr) Trt2.R
Trtmnt2.Rdt -0.775
Trtmnt3.Ald -0.756  0.586

1. Is it wrong to treat this as count data?

Hmmm... IST vaguely R that, when the world was young and I was (already)
silly, Poisson distribution used to be a *discrete* distribution. Of
course, this may or may not stand for "quasi"Poisson (for some value of
"quasi").

May I inquire if you tried to analyze log(Abundance) (or log(Count),
maybe including log(area) in the model) ?

HTH,

					Emmanuel Charpentier

2. I would like to retain these as random factors given that I designed my 
experiment as a randomised block design and repeated measures, albeit 
non-orthogonal and unbalanced. Is it acceptable to retain these random 
factors, is all else is correct?

I think so ...

To agree with what Emmanuel said above: you should use Count~...,
offset=log(area) for the correct analysis ...  that should solve
both your technical (zero random effects) and conceptual (even
quasiPoisson should be discrete data) issues.

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Ben Bolker
Associate professor, Biology Dep't, Univ. of Florida
bolker at ufl.edu / www.zoology.ufl.edu/bolker
GPG key: www.zoology.ufl.edu/bolker/benbolker-publickey.asc

Christine Griffiths

Tue, Jun 2, 2009 2:45 AM #

Dear Emmanuel and Ben

Many thanks for your advice. Unfortunately, I don't think that I can offset 
with log(area), given that each area is the same. My rationale for 
converting to m2 was to standardise abundances to 1 m2 as I have other 
parameters which were measured to different areas. I had previously 
attempted to normalise my data by logging but felt that it did not improve 
the distribution. I just hadn't tried it in my modelling. Logging my count 
data dramatically improved the fit of the model (AIC 116.7 v 312.5), 
however the variance still remains low. Does this appear acceptable?
Furthermore, can I assess model fit of different transformations of the 
same dataset using AIC values, i.e. compare log(Count) and inverse 
transformed Count?

lncount<-log(Count+1)
m1<-m1<-lmer(lncount~Treatment+(1|Month)+(1|Block),family=quasipoisson)
summary(m1)
Generalized linear mixed model fit by the Laplace approximation
Formula: lncount ~ Treatment + (1 | Month) + (1 | Block)
   AIC   BIC logLik deviance
 116.7 135.1 -52.33    104.7
Random effects:
 Groups   Name        Variance   Std.Dev.
 Month    (Intercept) 1.8937e-14 1.3761e-07
 Block    (Intercept) 3.5018e-02 1.8713e-01
 Residual             3.9318e-01 6.2704e-01
Number of obs: 160, groups: Month, 10; Block, 6

Fixed effects:
                   Estimate Std. Error t value
(Intercept)         -0.4004     0.1239  -3.232
Treatment2.Radiata   0.4596     0.1305   3.522
Treatment3.Aldabra   0.4295     0.1334   3.220

Correlation of Fixed Effects:
            (Intr) Trt2.R
Trtmnt2.Rdt -0.581
Trtmnt3.Ald -0.577  0.530

I used quasipoisson as my data is overdispersed. It was further improved by 
an inverse transformation (AIC 43.54). Again I have small variances.

invcount<-1/(Count+1)
m3<-lmer(invcount~Treatment+(1|Month)+(1|Block),family=quasipoisson)
summary(m3)
Generalized linear mixed model fit by the Laplace approximation
Formula: invcount ~ Treatment + (1 | Month) + (1 | Block)
   AIC BIC logLik deviance
 43.54  62 -15.77    31.54
Random effects:
 Groups   Name        Variance  Std.Dev.
 Month    (Intercept) 0.0000000 0.000000
 Block    (Intercept) 0.0021038 0.045867
 Residual             0.0926225 0.304339
Number of obs: 160, groups: Month, 10; Block, 6

Fixed effects:
                   Estimate Std. Error t value
(Intercept)        -0.51644    0.05411  -9.545
Treatment2.Radiata -0.36246    0.08401  -4.314
Treatment3.Aldabra -0.29319    0.08197  -3.577

Correlation of Fixed Effects:
            (Intr) Trt2.R
Trtmnt2.Rdt -0.566
Trtmnt3.Ald -0.580  0.372

Log(Abundance) did not solve the problem of zero variance. If quasipoisson 
errors are not acceptable to use with abundance, i.e. non-integers, is 
there a family of errors that would be recommended? Or should I simply 
multiply abundance to obtain whole numbers?

Many thanks in advance,
Christine

--On 01 June 2009 23:17 -0400 Ben Bolker <bolker at ufl.edu> wrote:

Emmanuel Charpentier wrote:

Le lundi 01 juin 2009 ? 18:00 +0100, Christine Griffiths a ?crit :

Dear R users,

I am having a problem with getting zero variance in my lmer models
which  specify two random effects. Having scoured the help lists, I
have read that  this could be because my variables are strongly
correlated. However, when I  simplify my model I still encounter the
same problem.

My response variable is abundance which ranges from 0-0.14.

Below is an example of my model:

m1<-lmer(Abundance~Treatment+(1|Month)+(1|Block),family=quasipoisson)
summary(m1)

Generalized linear mixed model fit by the Laplace approximation
Formula: Abundance ~ Treatment + (1 | Month) + (1 | Block)
   AIC   BIC logLik deviance
 17.55 36.00 -2.777    5.554
Random effects:
 Groups   Name        Variance   Std.Dev.
 Month    (Intercept) 5.1704e-17 7.1906e-09
 Block    (Intercept) 0.0000e+00 0.0000e+00
 Residual             1.0695e-03 3.2704e-02
Number of obs: 160, groups: Month, 10; Block, 6

Fixed effects:
                   Estimate Std. Error t value
(Intercept)        -3.73144    0.02728 -136.80
Treatment2.Radiata  0.58779    0.03521   16.69
Treatment3.Aldabra  0.47269    0.03606   13.11

Correlation of Fixed Effects:
            (Intr) Trt2.R
Trtmnt2.Rdt -0.775
Trtmnt3.Ald -0.756  0.586

1. Is it wrong to treat this as count data?

Hmmm... IST vaguely R that, when the world was young and I was (already)
silly, Poisson distribution used to be a *discrete* distribution. Of
course, this may or may not stand for "quasi"Poisson (for some value of
"quasi").

May I inquire if you tried to analyze log(Abundance) (or log(Count),
maybe including log(area) in the model) ?

HTH,

					Emmanuel Charpentier

2. I would like to retain these as random factors given that I designed
my  experiment as a randomised block design and repeated measures,
albeit  non-orthogonal and unbalanced. Is it acceptable to retain these
random  factors, is all else is correct?

   I think so ...

3. The above response variable was calculated per m2 by dividing the
Count  by the sample area. When I used the Count (range 0-9) as my
response  variable, I get a small but reasonable variation of random
effects. Could  anyone explain why this occurs and whether one response
variable is better  than another?

  To agree with what Emmanuel said above: you should use Count~...,
offset=log(area) for the correct analysis ...  that should solve
both your technical (zero random effects) and conceptual (even
quasiPoisson should be discrete data) issues.

m2<-lmer(Count~Treatment+(1|Month)+(1|Block),family=quasipoisson)
summary(m2)

Generalized linear mixed model fit by the Laplace approximation
Formula: Count ~ Treatment + (1 | Month) + (1 | Block)
   AIC BIC logLik deviance
 312.5 331 -150.3    300.5
Random effects:
 Groups   Name        Variance Std.Dev.
 Month    (Intercept) 0.14591  0.38198
 Block    (Intercept) 0.58690  0.76609
 Residual             2.79816  1.67277
Number of obs: 160, groups: Month, 10; Block, 6

Fixed effects:
                   Estimate Std. Error t value
(Intercept)          0.3098     0.3799  0.8155
Treatment2.Radiata   0.5879     0.2299  2.5575
Treatment3.Aldabra   0.5745     0.2382  2.4117

Correlation of Fixed Effects:
            (Intr) Trt2.R
Trtmnt2.Rdt -0.347
Trtmnt3.Ald -0.348  0.536

Many thanks,
Christine

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

----------------------
Christine Griffiths
School of Biological Sciences
University of Bristol
Woodland Road
Bristol BS8 1UG
Tel: 0117 9287593
Fax 0117 925 7374
Christine.Griffiths at bristol.ac.uk
http://www.bio.bris.ac.uk/research/mammal/tortoises.html

Ben Bolker

Tue, Jun 2, 2009 4:43 AM #

Christine Griffiths wrote:

Why not?  All the offset does is add a constant (i.e., fixed rather
than estimated -- could be the same or different for different
observations) to the regression model.

Don't quite understand this.  Parameters from other studies that you
want to compare in discussion?  If so, you can just rescale your
predictions/parameters *after* you estimate them ...

I had previously

No, not without a correction.  See
http://www.unc.edu/courses/2007spring/enst/562/001/docs/lectures/lecture22.htm

  Generalized linear modeling is not as flexible (in some ways) as
classical linear models -- you can't just transform the data any way you
want (in principle I suppose you could,  but it's basically not possible
to "transform to achieve a Poisson distribution" the way you would
transform continuous data to achieve normality etc.)

lncount<-log(Count+1)
m1<-m1<-lmer(lncount~Treatment+(1|Month)+(1|Block),family=quasipoisson)
summary(m1)
Generalized linear mixed model fit by the Laplace approximation
Formula: lncount ~ Treatment + (1 | Month) + (1 | Block)
   AIC   BIC logLik deviance
 116.7 135.1 -52.33    104.7
Random effects:
 Groups   Name        Variance   Std.Dev.
 Month    (Intercept) 1.8937e-14 1.3761e-07
 Block    (Intercept) 3.5018e-02 1.8713e-01
 Residual             3.9318e-01 6.2704e-01
Number of obs: 160, groups: Month, 10; Block, 6

Fixed effects:
                   Estimate Std. Error t value
(Intercept)         -0.4004     0.1239  -3.232
Treatment2.Radiata   0.4596     0.1305   3.522
Treatment3.Aldabra   0.4295     0.1334   3.220

Correlation of Fixed Effects:
            (Intr) Trt2.R
Trtmnt2.Rdt -0.581
Trtmnt3.Ald -0.577  0.530

I used quasipoisson as my data is overdispersed. It was further improved by 
an inverse transformation (AIC 43.54). Again I have small variances.

invcount<-1/(Count+1)
m3<-lmer(invcount~Treatment+(1|Month)+(1|Block),family=quasipoisson)
summary(m3)
Generalized linear mixed model fit by the Laplace approximation
Formula: invcount ~ Treatment + (1 | Month) + (1 | Block)
   AIC BIC logLik deviance
 43.54  62 -15.77    31.54
Random effects:
 Groups   Name        Variance  Std.Dev.
 Month    (Intercept) 0.0000000 0.000000
 Block    (Intercept) 0.0021038 0.045867
 Residual             0.0926225 0.304339
Number of obs: 160, groups: Month, 10; Block, 6

Fixed effects:
                   Estimate Std. Error t value
(Intercept)        -0.51644    0.05411  -9.545
Treatment2.Radiata -0.36246    0.08401  -4.314
Treatment3.Aldabra -0.29319    0.08197  -3.577

Correlation of Fixed Effects:
            (Intr) Trt2.R
Trtmnt2.Rdt -0.566
Trtmnt3.Ald -0.580  0.372

Log(Abundance) did not solve the problem of zero variance. If quasipoisson 
errors are not acceptable to use with abundance, i.e. non-integers, is 
there a family of errors that would be recommended? Or should I simply 
multiply abundance to obtain whole numbers?

Many thanks in advance,
Christine


--On 01 June 2009 23:17 -0400 Ben Bolker <bolker at ufl.edu> wrote:

Emmanuel Charpentier wrote:

Le lundi 01 juin 2009 ? 18:00 +0100, Christine Griffiths a ?crit :

Dear R users,

I am having a problem with getting zero variance in my lmer models
which  specify two random effects. Having scoured the help lists, I
have read that  this could be because my variables are strongly
correlated. However, when I  simplify my model I still encounter the
same problem.

My response variable is abundance which ranges from 0-0.14.

Below is an example of my model:

m1<-lmer(Abundance~Treatment+(1|Month)+(1|Block),family=quasipoisson)
summary(m1)

Generalized linear mixed model fit by the Laplace approximation
Formula: Abundance ~ Treatment + (1 | Month) + (1 | Block)
   AIC   BIC logLik deviance
 17.55 36.00 -2.777    5.554
Random effects:
 Groups   Name        Variance   Std.Dev.
 Month    (Intercept) 5.1704e-17 7.1906e-09
 Block    (Intercept) 0.0000e+00 0.0000e+00
 Residual             1.0695e-03 3.2704e-02
Number of obs: 160, groups: Month, 10; Block, 6

Fixed effects:
                   Estimate Std. Error t value
(Intercept)        -3.73144    0.02728 -136.80
Treatment2.Radiata  0.58779    0.03521   16.69
Treatment3.Aldabra  0.47269    0.03606   13.11

Correlation of Fixed Effects:
            (Intr) Trt2.R
Trtmnt2.Rdt -0.775
Trtmnt3.Ald -0.756  0.586

1. Is it wrong to treat this as count data?

Hmmm... IST vaguely R that, when the world was young and I was (already)
silly, Poisson distribution used to be a *discrete* distribution. Of
course, this may or may not stand for "quasi"Poisson (for some value of
"quasi").

May I inquire if you tried to analyze log(Abundance) (or log(Count),
maybe including log(area) in the model) ?

HTH,

					Emmanuel Charpentier

2. I would like to retain these as random factors given that I designed
my  experiment as a randomised block design and repeated measures,
albeit  non-orthogonal and unbalanced. Is it acceptable to retain these
random  factors, is all else is correct?

   I think so ...

3. The above response variable was calculated per m2 by dividing the
Count  by the sample area. When I used the Count (range 0-9) as my
response  variable, I get a small but reasonable variation of random
effects. Could  anyone explain why this occurs and whether one response
variable is better  than another?

  To agree with what Emmanuel said above: you should use Count~...,
offset=log(area) for the correct analysis ...  that should solve
both your technical (zero random effects) and conceptual (even
quasiPoisson should be discrete data) issues.

m2<-lmer(Count~Treatment+(1|Month)+(1|Block),family=quasipoisson)
summary(m2)

Generalized linear mixed model fit by the Laplace approximation
Formula: Count ~ Treatment + (1 | Month) + (1 | Block)
   AIC BIC logLik deviance
 312.5 331 -150.3    300.5
Random effects:
 Groups   Name        Variance Std.Dev.
 Month    (Intercept) 0.14591  0.38198
 Block    (Intercept) 0.58690  0.76609
 Residual             2.79816  1.67277
Number of obs: 160, groups: Month, 10; Block, 6

Fixed effects:
                   Estimate Std. Error t value
(Intercept)          0.3098     0.3799  0.8155
Treatment2.Radiata   0.5879     0.2299  2.5575
Treatment3.Aldabra   0.5745     0.2382  2.4117

Correlation of Fixed Effects:
            (Intr) Trt2.R
Trtmnt2.Rdt -0.347
Trtmnt3.Ald -0.348  0.536

Many thanks,
Christine

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Ben Bolker
Associate professor, Biology Dep't, Univ. of Florida
bolker at ufl.edu / www.zoology.ufl.edu/bolker
GPG key: www.zoology.ufl.edu/bolker/benbolker-publickey.asc

Tue, Jun 2, 2009 10:54 AM #

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20090602/3568b0e3/attachment.pl>

Emmanuel Charpentier

Tue, Jun 2, 2009 11:07 AM #

Le mardi 02 juin 2009 ? 10:45 +0100, Christine Griffiths a ?crit :

Aaargh ! I thought that "lmer(lncount~Treatment+(1|Month)+(1|
Block),family=gaussian)" (+/- log(area somewhere in the model or the
fit) might give good hints as a first (known bad) approximation, and
didn't made myself clear. That's what happens when you try to be
sarcastic while dog-tired...

And maybe your random effects *are* quite low. What happens with :
summary(glm(Count~Treatment+Month+Block, family=quasipoisson) (or
something to that effect) ? What says a simple crosstabulation ? or a
graph ("caterpillar", for example) ?

HTH,

					Emmanuel Charpentier

summary(m1)
Generalized linear mixed model fit by the Laplace approximation
Formula: lncount ~ Treatment + (1 | Month) + (1 | Block)
   AIC   BIC logLik deviance
 116.7 135.1 -52.33    104.7
Random effects:
 Groups   Name        Variance   Std.Dev.
 Month    (Intercept) 1.8937e-14 1.3761e-07
 Block    (Intercept) 3.5018e-02 1.8713e-01
 Residual             3.9318e-01 6.2704e-01
Number of obs: 160, groups: Month, 10; Block, 6

Fixed effects:
                   Estimate Std. Error t value
(Intercept)         -0.4004     0.1239  -3.232
Treatment2.Radiata   0.4596     0.1305   3.522
Treatment3.Aldabra   0.4295     0.1334   3.220

Correlation of Fixed Effects:
            (Intr) Trt2.R
Trtmnt2.Rdt -0.581
Trtmnt3.Ald -0.577  0.530

I used quasipoisson as my data is overdispersed. It was further improved by 
an inverse transformation (AIC 43.54). Again I have small variances.

invcount<-1/(Count+1)
m3<-lmer(invcount~Treatment+(1|Month)+(1|Block),family=quasipoisson)
summary(m3)
Generalized linear mixed model fit by the Laplace approximation
Formula: invcount ~ Treatment + (1 | Month) + (1 | Block)
   AIC BIC logLik deviance
 43.54  62 -15.77    31.54
Random effects:
 Groups   Name        Variance  Std.Dev.
 Month    (Intercept) 0.0000000 0.000000
 Block    (Intercept) 0.0021038 0.045867
 Residual             0.0926225 0.304339
Number of obs: 160, groups: Month, 10; Block, 6

Fixed effects:
                   Estimate Std. Error t value
(Intercept)        -0.51644    0.05411  -9.545
Treatment2.Radiata -0.36246    0.08401  -4.314
Treatment3.Aldabra -0.29319    0.08197  -3.577

Correlation of Fixed Effects:
            (Intr) Trt2.R
Trtmnt2.Rdt -0.566
Trtmnt3.Ald -0.580  0.372

Log(Abundance) did not solve the problem of zero variance. If quasipoisson 
errors are not acceptable to use with abundance, i.e. non-integers, is 
there a family of errors that would be recommended? Or should I simply 
multiply abundance to obtain whole numbers?

Many thanks in advance,
Christine


--On 01 June 2009 23:17 -0400 Ben Bolker <bolker-hnhdhkBXzx8 at public.gmane.org> wrote:

Emmanuel Charpentier wrote:

Le lundi 01 juin 2009 ? 18:00 +0100, Christine Griffiths a ?crit :

Dear R users,

I am having a problem with getting zero variance in my lmer models
which  specify two random effects. Having scoured the help lists, I
have read that  this could be because my variables are strongly
correlated. However, when I  simplify my model I still encounter the
same problem.

My response variable is abundance which ranges from 0-0.14.

Below is an example of my model:

m1<-lmer(Abundance~Treatment+(1|Month)+(1|Block),family=quasipoisson)
summary(m1)

Generalized linear mixed model fit by the Laplace approximation
Formula: Abundance ~ Treatment + (1 | Month) + (1 | Block)
   AIC   BIC logLik deviance
 17.55 36.00 -2.777    5.554
Random effects:
 Groups   Name        Variance   Std.Dev.
 Month    (Intercept) 5.1704e-17 7.1906e-09
 Block    (Intercept) 0.0000e+00 0.0000e+00
 Residual             1.0695e-03 3.2704e-02
Number of obs: 160, groups: Month, 10; Block, 6

Fixed effects:
                   Estimate Std. Error t value
(Intercept)        -3.73144    0.02728 -136.80
Treatment2.Radiata  0.58779    0.03521   16.69
Treatment3.Aldabra  0.47269    0.03606   13.11

Correlation of Fixed Effects:
            (Intr) Trt2.R
Trtmnt2.Rdt -0.775
Trtmnt3.Ald -0.756  0.586

1. Is it wrong to treat this as count data?

Hmmm... IST vaguely R that, when the world was young and I was (already)
silly, Poisson distribution used to be a *discrete* distribution. Of
course, this may or may not stand for "quasi"Poisson (for some value of
"quasi").

May I inquire if you tried to analyze log(Abundance) (or log(Count),
maybe including log(area) in the model) ?

HTH,

					Emmanuel Charpentier

2. I would like to retain these as random factors given that I designed
my  experiment as a randomised block design and repeated measures,
albeit  non-orthogonal and unbalanced. Is it acceptable to retain these
random  factors, is all else is correct?

   I think so ...

3. The above response variable was calculated per m2 by dividing the
Count  by the sample area. When I used the Count (range 0-9) as my
response  variable, I get a small but reasonable variation of random
effects. Could  anyone explain why this occurs and whether one response
variable is better  than another?

  To agree with what Emmanuel said above: you should use Count~...,
offset=log(area) for the correct analysis ...  that should solve
both your technical (zero random effects) and conceptual (even
quasiPoisson should be discrete data) issues.

m2<-lmer(Count~Treatment+(1|Month)+(1|Block),family=quasipoisson)
summary(m2)

Generalized linear mixed model fit by the Laplace approximation
Formula: Count ~ Treatment + (1 | Month) + (1 | Block)
   AIC BIC logLik deviance
 312.5 331 -150.3    300.5
Random effects:
 Groups   Name        Variance Std.Dev.
 Month    (Intercept) 0.14591  0.38198
 Block    (Intercept) 0.58690  0.76609
 Residual             2.79816  1.67277
Number of obs: 160, groups: Month, 10; Block, 6

Fixed effects:
                   Estimate Std. Error t value
(Intercept)          0.3098     0.3799  0.8155
Treatment2.Radiata   0.5879     0.2299  2.5575
Treatment3.Aldabra   0.5745     0.2382  2.4117

Correlation of Fixed Effects:
            (Intr) Trt2.R
Trtmnt2.Rdt -0.347
Trtmnt3.Ald -0.348  0.536

Many thanks,
Christine

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Christine Griffiths

Wed, Jun 3, 2009 3:05 AM #

Dear All

Thank you for all your incredibly helpful advice. Having done as suggested 
by Emmanuel, I have come to the conclusion that there is basically very 
little variation among my data, which is perhaps not surprising given that 
I have a small count range 0-9. I was just concerned that such low variance 
meant that there was something wrong with my analysis.

Emmanuel, logging my data does not improve the distribution and so I think 
I cannot use Gaussian errors. I have a large number of zeros. These are off 
biological significance and so I am interested in retaining them in the 
model.

Since Anna mentioned that quasi models are unstable in lme4, would you 
recommend using poisson models and calculating QAIC instead?

Many thanks
Christine


--On 02 June 2009 20:07 +0200 Emmanuel Charpentier

<charpent at bacbuc.dyndns.org> wrote:

Le mardi 02 juin 2009 ? 10:45 +0100, Christine Griffiths a ?crit :

Dear Emmanuel and Ben

Many thanks for your advice. Unfortunately, I don't think that I can
offset  with log(area), given that each area is the same. My rationale
for  converting to m2 was to standardise abundances to 1 m2 as I have
other  parameters which were measured to different areas. I had
previously  attempted to normalise my data by logging but felt that it
did not improve  the distribution. I just hadn't tried it in my
modelling. Logging my count  data dramatically improved the fit of the
model (AIC 116.7 v 312.5),  however the variance still remains low. Does
this appear acceptable? Furthermore, can I assess model fit of different
transformations of the  same dataset using AIC values, i.e. compare
log(Count) and inverse  transformed Count?

lncount<-log(Count+1)
m1<-m1<-lmer(lncount~Treatment+(1|Month)+(1|Block),family=quasipoisson)

Aaargh ! I thought that "lmer(lncount~Treatment+(1|Month)+(1|
Block),family=gaussian)" (+/- log(area somewhere in the model or the
fit) might give good hints as a first (known bad) approximation, and
didn't made myself clear. That's what happens when you try to be
sarcastic while dog-tired...

And maybe your random effects *are* quite low. What happens with :
summary(glm(Count~Treatment+Month+Block, family=quasipoisson) (or
something to that effect) ? What says a simple crosstabulation ? or a
graph ("caterpillar", for example) ?

HTH,

					Emmanuel Charpentier

summary(m1)
Generalized linear mixed model fit by the Laplace approximation
Formula: lncount ~ Treatment + (1 | Month) + (1 | Block)
   AIC   BIC logLik deviance
 116.7 135.1 -52.33    104.7
Random effects:
 Groups   Name        Variance   Std.Dev.
 Month    (Intercept) 1.8937e-14 1.3761e-07
 Block    (Intercept) 3.5018e-02 1.8713e-01
 Residual             3.9318e-01 6.2704e-01
Number of obs: 160, groups: Month, 10; Block, 6

Fixed effects:
                   Estimate Std. Error t value
(Intercept)         -0.4004     0.1239  -3.232
Treatment2.Radiata   0.4596     0.1305   3.522
Treatment3.Aldabra   0.4295     0.1334   3.220

Correlation of Fixed Effects:
            (Intr) Trt2.R
Trtmnt2.Rdt -0.581
Trtmnt3.Ald -0.577  0.530

I used quasipoisson as my data is overdispersed. It was further improved
by  an inverse transformation (AIC 43.54). Again I have small variances.

invcount<-1/(Count+1)
m3<-lmer(invcount~Treatment+(1|Month)+(1|Block),family=quasipoisson)
summary(m3)
Generalized linear mixed model fit by the Laplace approximation
Formula: invcount ~ Treatment + (1 | Month) + (1 | Block)
   AIC BIC logLik deviance
 43.54  62 -15.77    31.54
Random effects:
 Groups   Name        Variance  Std.Dev.
 Month    (Intercept) 0.0000000 0.000000
 Block    (Intercept) 0.0021038 0.045867
 Residual             0.0926225 0.304339
Number of obs: 160, groups: Month, 10; Block, 6

Fixed effects:
                   Estimate Std. Error t value
(Intercept)        -0.51644    0.05411  -9.545
Treatment2.Radiata -0.36246    0.08401  -4.314
Treatment3.Aldabra -0.29319    0.08197  -3.577

Correlation of Fixed Effects:
            (Intr) Trt2.R
Trtmnt2.Rdt -0.566
Trtmnt3.Ald -0.580  0.372

Log(Abundance) did not solve the problem of zero variance. If
quasipoisson  errors are not acceptable to use with abundance, i.e.
non-integers, is  there a family of errors that would be recommended? Or
should I simply  multiply abundance to obtain whole numbers?

Many thanks in advance,
Christine


--On 01 June 2009 23:17 -0400 Ben Bolker
<bolker-hnhdhkBXzx8 at public.gmane.org> wrote:

Emmanuel Charpentier wrote:

Le lundi 01 juin 2009 ? 18:00 +0100, Christine Griffiths a ?crit :

Dear R users,

I am having a problem with getting zero variance in my lmer models
which  specify two random effects. Having scoured the help lists, I
have read that  this could be because my variables are strongly
correlated. However, when I  simplify my model I still encounter the
same problem.

My response variable is abundance which ranges from 0-0.14.

Below is an example of my model:

m1<-lmer(Abundance~Treatment+(1|Month)+(1|Block),family=quasipoisso
n) summary(m1)

Generalized linear mixed model fit by the Laplace approximation
Formula: Abundance ~ Treatment + (1 | Month) + (1 | Block)
   AIC   BIC logLik deviance
 17.55 36.00 -2.777    5.554
Random effects:
 Groups   Name        Variance   Std.Dev.
 Month    (Intercept) 5.1704e-17 7.1906e-09
 Block    (Intercept) 0.0000e+00 0.0000e+00
 Residual             1.0695e-03 3.2704e-02
Number of obs: 160, groups: Month, 10; Block, 6

Fixed effects:
                   Estimate Std. Error t value
(Intercept)        -3.73144    0.02728 -136.80
Treatment2.Radiata  0.58779    0.03521   16.69
Treatment3.Aldabra  0.47269    0.03606   13.11

Correlation of Fixed Effects:
            (Intr) Trt2.R
Trtmnt2.Rdt -0.775
Trtmnt3.Ald -0.756  0.586

1. Is it wrong to treat this as count data?

Hmmm... IST vaguely R that, when the world was young and I was
(already) silly, Poisson distribution used to be a *discrete*
distribution. Of course, this may or may not stand for "quasi"Poisson
(for some value of "quasi").

May I inquire if you tried to analyze log(Abundance) (or log(Count),
maybe including log(area) in the model) ?

HTH,

					Emmanuel Charpentier

2. I would like to retain these as random factors given that I
designed my  experiment as a randomised block design and repeated
measures, albeit  non-orthogonal and unbalanced. Is it acceptable to
retain these random  factors, is all else is correct?

   I think so ...

3. The above response variable was calculated per m2 by dividing the
Count  by the sample area. When I used the Count (range 0-9) as my
response  variable, I get a small but reasonable variation of random
effects. Could  anyone explain why this occurs and whether one
response variable is better  than another?

  To agree with what Emmanuel said above: you should use Count~...,
offset=log(area) for the correct analysis ...  that should solve
both your technical (zero random effects) and conceptual (even
quasiPoisson should be discrete data) issues.

m2<-lmer(Count~Treatment+(1|Month)+(1|Block),family=quasipoisson)
summary(m2)

Generalized linear mixed model fit by the Laplace approximation
Formula: Count ~ Treatment + (1 | Month) + (1 | Block)
   AIC BIC logLik deviance
 312.5 331 -150.3    300.5
Random effects:
 Groups   Name        Variance Std.Dev.
 Month    (Intercept) 0.14591  0.38198
 Block    (Intercept) 0.58690  0.76609
 Residual             2.79816  1.67277
Number of obs: 160, groups: Month, 10; Block, 6

Fixed effects:
                   Estimate Std. Error t value
(Intercept)          0.3098     0.3799  0.8155
Treatment2.Radiata   0.5879     0.2299  2.5575
Treatment3.Aldabra   0.5745     0.2382  2.4117

Correlation of Fixed Effects:
            (Intr) Trt2.R
Trtmnt2.Rdt -0.347
Trtmnt3.Ald -0.348  0.536

Many thanks,
Christine

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

----------------------
Christine Griffiths
School of Biological Sciences
University of Bristol
Woodland Road
Bristol BS8 1UG
Tel: 0117 9287593
Fax 0117 925 7374
Christine.Griffiths at bristol.ac.uk
http://www.bio.bris.ac.uk/research/mammal/tortoises.html