An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20081007/4baf62f5/attachment.pl>
generalized linear mixed models: large differences when using glmmPQL or lmer with laplace approximation
8 messages · Greg Snow, Douglas Bates, Ben Bolker +2 more
Martijn, Absolute truth is the realm of pure mathematics, philosophy, and religion. R deals with statistics where we are more concerned with usefulness than absolute correctness. One famous quote among statisticians is: "All models are wrong, some models are useful". Given that, I would suggest looking at the quality of the predictions from your models. Get predicted values from the 2 different fits and compare them to each other and the original data. If one set of predictions is clearly superior, then you have your answer. If the predictions are not that different, then that can tell you that possibly there are multiple models that are equally useful for this data. Then you can look at the predictions from models fit with fewer predictors to see how that affects the predictions. Hope this helps, -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.snow at imail.org 801.408.8111
-----Original Message----- From: r-sig-mixed-models-bounces at r-project.org [mailto:r-sig-mixed- models-bounces at r-project.org] On Behalf Of Martijn Vandegehuchte Sent: Tuesday, October 07, 2008 7:21 AM To: r-sig-mixed-models at r-project.org Subject: [R-sig-ME] generalized linear mixed models: large differences when using glmmPQL or lmer with laplace approximation Dear list, First of all, I am a mere ecologist, trying to get the truth out of his data, and not a statistician, so forgive me my lack of statistical background and possible conceptual misunderstandings. I am currently comparing generalized linear mixed models in glmmPQL and lmer, with a quasipoisson family, and have found out that parameter estimates are quite different for both methods. I read some of the discussions on the R-forum and it seems that the Laplace approximation used in the current version of lmer is generally preferred to the PQL method. I am an ex-SAS user, and in proc glimmix in SAS the default is PQL, and the estimates and p-values are almost exact the same as with glmmPQL in R. But lmer gives quite different results, and now I am wondering what would be the best option for me. First of all, parameter estimates of a same model can be somewhat different in lmer or glmmPQL. Second of all, in lmer, I only get t- values but no associated p-values (apparently they are omitted because of the uncertainty about the df). But if I compare the t-values generated by glmmPQL with those of a same model in lmer, the differences are substantial. My dataset consists of 120 observations, so basically you could guess the order of magnitude of the p-values in lmer based on the t-value and a "large" df. First example: In lmer:
model<-
lmer(schirufu~diameter+leafvit+densroot+cover+nemcm+(1|site),family=qua sipoisson)
summary (model)
Generalized linear mixed model fit by the Laplace approximation
Formula: schirufu ~ diameter + leafvit + densroot + cover + nemcm + (1
| site)
AIC BIC logLik deviance
2045 2068 -1015 2029
Random effects:
Groups Name Variance Std.Dev.
site (Intercept) 12.700 3.5638
Residual 15.182 3.8964
Number of obs: 120, groups: site, 6
Fixed effects:
Estimate Std. Error t value
(Intercept) 1.31017 1.47249 0.890
diameter -0.24799 0.29180 -0.850
leafvit 1.29007 0.21041 6.131
densroot 0.31024 0.04939 6.281
cover -0.24544 0.22179 -1.107
nemcm 0.24817 0.12028 2.063
Correlation of Fixed Effects:
(Intr) diamtr leafvt densrt cover
diameter 0.031
leafvit -0.083 0.321
densroot 0.011 -0.017 -0.202
cover 0.021 -0.448 0.016 0.214
nemcm -0.014 0.114 0.114 0.310 -0.017
Although no p-values are given, it suggests that fixed effects leafvit, densroot and nemcm would be significant. In glmmPQL:
model<-
glmmPQL(schirufu~diameter+leafvit+densroot+cover+nemcm,random=~1|site,f amily=quasipoisson) iteration 1 iteration 2 iteration 3 iteration 4 iteration 5
summary(model)
Linear mixed-effects model fit by maximum likelihood
Data: NULL
AIC BIC logLik
NA NA NA
Random effects:
Formula: ~1 | site
(Intercept) Residual
StdDev: 0.7864989 4.63591
Variance function:
Structure: fixed weights
Formula: ~invwt
Fixed effects: schirufu ~ diameter + leafvit + densroot + cover + nemcm
Value Std.Error DF t-value p-value
(Intercept) 1.4486735 0.4174843 109 3.470007 0.0007
diameter -0.2600504 0.3477017 109 -0.747913 0.4561
leafvit 1.2236406 0.2489291 109 4.915619 0.0000
densroot 0.3236446 0.0596342 109 5.427164 0.0000
cover -0.2523163 0.2698555 109 -0.935005 0.3519
nemcm 0.2336305 0.1451751 109 1.609301 0.1104
Correlation:
(Intr) diamtr leafvt densrt cover
diameter 0.130
leafvit -0.335 0.313
densroot 0.027 -0.022 -0.203
cover 0.090 -0.463 0.015 0.214
nemcm -0.056 0.097 0.107 0.301 -0.014
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-2.4956188 -0.4154369 -0.1333850 0.1724601 4.7355928
Number of Observations: 120
Number of Groups: 6
Note the difference in parameter estimates. Also, the fixed effect nemcm now is not significant any more. Second example,now with an offset: In lmer:
model<-
lmer(nemcm~diameter+leafvit+densroot+rootvit+cover+schirufu+(1|site), offset= loglength, family=quasipoisson)
summary (model)
Generalized linear mixed model fit by the Laplace approximation
Formula: nemcm ~ diameter + leafvit + densroot + rootvit + cover +
schirufu + (1 | site)
AIC BIC logLik deviance
1593 1618 -787.4 1575
Random effects:
Groups Name Variance Std.Dev.
site (Intercept) 21.522 4.6392
Residual 173.888 13.1867
Number of obs: 120, groups: site, 6
Fixed effects:
Estimate Std. Error t value
(Intercept) 0.06733 1.92761 0.0349
diameter 0.14665 0.60693 0.2416
leafvit -0.19902 0.48802 -0.4078
densroot -0.49178 0.64221 -0.7658
rootvit 0.37699 0.46810 0.8054
cover -0.23545 0.57896 -0.4067
schirufu 0.23226 0.46866 0.4956
Correlation of Fixed Effects:
(Intr) diamtr leafvt densrt rootvt cover
diameter -0.016
leafvit 0.015 0.396
densroot 0.055 -0.233 -0.291
rootvit -0.038 -0.251 -0.629 0.277
cover 0.024 -0.796 -0.133 0.253 0.117
schirufu -0.032 0.137 -0.029 -0.505 -0.078 -0.121
This suggests no significant effects at all. In glmmPQL:
model<-
glmmPQL(nemcm~diameter+leafvit+densroot+rootvit+cover+schirufu+offset(l oglength),random=~1|site, family=quasipoisson) iteration 1 iteration 2 iteration 3
summary (model)
Linear mixed-effects model fit by maximum likelihood
Data: NULL
AIC BIC logLik
NA NA NA
Random effects:
Formula: ~1 | site
(Intercept) Residual
StdDev: 0.2684477 4.507758
Variance function:
Structure: fixed weights
Formula: ~invwt
Fixed effects: nemcm ~ diameter + leafvit + densroot + rootvit + cover
+ schirufu + offset(loglength)
Value Std.Error DF t-value p-value
(Intercept) 0.1131898 0.1656949 108 0.6831220 0.4960
diameter 0.1225231 0.1976568 108 0.6198779 0.5366
leafvit -0.2191361 0.1697784 108 -1.2907181 0.1996
densroot -0.4733839 0.2221562 108 -2.1308604 0.0354
rootvit 0.3858120 0.1615706 108 2.3878846 0.0187
cover -0.2075038 0.1922054 108 -1.0795940 0.2827
schirufu 0.2028444 0.1633954 108 1.2414323 0.2171
Correlation:
(Intr) diamtr leafvt densrt rootvt cover
diameter -0.050
leafvit 0.077 0.360
densroot 0.217 -0.168 -0.262
rootvit -0.163 -0.202 -0.632 0.257
cover 0.084 -0.772 -0.098 0.200 0.073
schirufu -0.103 0.099 -0.050 -0.483 -0.068 -0.075
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-1.1146287 -0.5208003 -0.1927005 0.2462878 7.9755368
Number of Observations: 120
Number of Groups: 6
Again some differences in parameter estimates, but now the two fixed
effects densroot and rootvit turn out to be significant.
So my questions are:
- what would you recommend me to use? lmer or glmmPQL (laplace
approximation or penalized quasi-likelihood)?
- if lmer is the better option, is there a way to get a reliable p-
value for the fixed effects?
I have experienced that deleting a term and comparing models using
anova() always overestimates the significance of that term, probably
because the quasipoisson correction for overdispersion is not taken
into account.
Thank you very much beforehand,
Martijn.
--
Martijn Vandegehuchte
Ghent University
Department Biology
Terrestrial Ecology Unit
K.L.Ledeganckstraat 35
B-9000 Ghent
telephone: +32 (0)9/264 50 84
e-mail: martijn.vandegehuchte at ugent.be
website TEREC: www.ecology.ugent.be/terec
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Due to some travel and the need to attend to other projects, I haven't been keeping up as closely with this list as I normally do. Regarding the comparison between the PQL and Laplace methods for fitting generalized linear mixed models, I believe that the estimates produced by the Laplace method are more reliable than those from the PQL method. The objective function optimized by the Laplace method is a direct approxmation, and generally a very good approximation, to the log-likelihood for the model being fit. The PQL method is indirect (the "QL" part of the name stands for "quasi-likelihood") and, because it involves alternating conditional optimization, can alternate back-and-forth between two potential solutions, neither of which is optimal. (To be fair, such alternating occurs more frequently in the analogous method for nonlinear mixed-models, in which I was one of the co-conspirators, than in the PQL method for GLMMs.) It may be that the problem you are encountering has more to do with the use of the quasipoisson family than with the Laplace approximation. I am not sure that the derivation of the standard errors in lmer when using the quasipoisson family is correct, in part because I don't really understand the quasipoisson and quasibinomial families. As far as I know, they don't correspond to probability distributions so the theory is a bit iffy. Do you need to use the quasipoisson family or could you use the poisson family? Generally the motivation for the quasipoisson familiy is to accomodate overdispersion. Often in a generalized linear mixed model the problem is underdispersion rather than overdispersion. In one of Ben's replies in this thread he discusses the degrees of freedom attributed to certain t-statistics. Regular readers of this list are aware that degrees of freedom is one of my least favorite topics. If one has a reasonably large number of observations and a reasonably large number of groups then the issue is unimportant. (Uncertainty in degrees of freedom is important only when the value of the degrees of freedom is small. In fact, when I first started studying statistics we used the standard normal in place of the t-distribution whenever the degrees of freedom exceeded 30). Considering that the quasi-Poisson doesn't correspond to a probability distribution in the first place, (readers should feel free to correct me if I am wrong about this) I find the issue of the number of degrees of freedom that should be attributed to a distribution of a quantity calculated from a non-existent distribution to be somewhat off the point. I think the problem is more likely that the standard errors are not being calculated correctly. Is that what you concluded from your simulations, Ben? On Tue, Oct 7, 2008 at 8:21 AM, Martijn Vandegehuchte
<martijn.vandegehuchte at ugent.be> wrote:
Dear list,
First of all, I am a mere ecologist, trying to get the truth out of his data, and not a statistician, so forgive me my lack of statistical background and possible conceptual misunderstandings.
I am currently comparing generalized linear mixed models in glmmPQL and lmer, with a quasipoisson family, and have found out that parameter estimates are quite different for both methods. I read some of the discussions on the R-forum and it seems that the Laplace approximation used in the current version of lmer is generally preferred to the PQL method. I am an ex-SAS user, and in proc glimmix in SAS the default is PQL, and the estimates and p-values are almost exact the same as with glmmPQL in R. But lmer gives quite different results, and now I am wondering what would be the best option for me.
First of all, parameter estimates of a same model can be somewhat different in lmer or glmmPQL. Second of all, in lmer, I only get t-values but no associated p-values (apparently they are omitted because of the uncertainty about the df). But if I compare the t-values generated by glmmPQL with those of a same model in lmer, the differences are substantial. My dataset consists of 120 observations, so basically you could guess the order of magnitude of the p-values in lmer based on the t-value and a "large" df.
First example: In lmer:
model<-lmer(schirufu~diameter+leafvit+densroot+cover+nemcm+(1|site),family=quasipoisson) summary (model)
Generalized linear mixed model fit by the Laplace approximation
Formula: schirufu ~ diameter + leafvit + densroot + cover + nemcm + (1 | site)
AIC BIC logLik deviance
2045 2068 -1015 2029
Random effects:
Groups Name Variance Std.Dev.
site (Intercept) 12.700 3.5638
Residual 15.182 3.8964
Number of obs: 120, groups: site, 6
Fixed effects:
Estimate Std. Error t value
(Intercept) 1.31017 1.47249 0.890
diameter -0.24799 0.29180 -0.850
leafvit 1.29007 0.21041 6.131
densroot 0.31024 0.04939 6.281
cover -0.24544 0.22179 -1.107
nemcm 0.24817 0.12028 2.063
Correlation of Fixed Effects:
(Intr) diamtr leafvt densrt cover
diameter 0.031
leafvit -0.083 0.321
densroot 0.011 -0.017 -0.202
cover 0.021 -0.448 0.016 0.214
nemcm -0.014 0.114 0.114 0.310 -0.017
Although no p-values are given, it suggests that fixed effects leafvit, densroot and nemcm would be significant. In glmmPQL:
model<-glmmPQL(schirufu~diameter+leafvit+densroot+cover+nemcm,random=~1|site,family=quasipoisson)
iteration 1 iteration 2 iteration 3 iteration 4 iteration 5
summary(model)
Linear mixed-effects model fit by maximum likelihood
Data: NULL
AIC BIC logLik
NA NA NA
Random effects:
Formula: ~1 | site
(Intercept) Residual
StdDev: 0.7864989 4.63591
Variance function:
Structure: fixed weights
Formula: ~invwt
Fixed effects: schirufu ~ diameter + leafvit + densroot + cover + nemcm
Value Std.Error DF t-value p-value
(Intercept) 1.4486735 0.4174843 109 3.470007 0.0007
diameter -0.2600504 0.3477017 109 -0.747913 0.4561
leafvit 1.2236406 0.2489291 109 4.915619 0.0000
densroot 0.3236446 0.0596342 109 5.427164 0.0000
cover -0.2523163 0.2698555 109 -0.935005 0.3519
nemcm 0.2336305 0.1451751 109 1.609301 0.1104
Correlation:
(Intr) diamtr leafvt densrt cover
diameter 0.130
leafvit -0.335 0.313
densroot 0.027 -0.022 -0.203
cover 0.090 -0.463 0.015 0.214
nemcm -0.056 0.097 0.107 0.301 -0.014
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-2.4956188 -0.4154369 -0.1333850 0.1724601 4.7355928
Number of Observations: 120
Number of Groups: 6
Note the difference in parameter estimates. Also, the fixed effect nemcm now is not significant any more. Second example,now with an offset: In lmer:
model<-lmer(nemcm~diameter+leafvit+densroot+rootvit+cover+schirufu+(1|site), offset= loglength, family=quasipoisson) summary (model)
Generalized linear mixed model fit by the Laplace approximation
Formula: nemcm ~ diameter + leafvit + densroot + rootvit + cover + schirufu + (1 | site)
AIC BIC logLik deviance
1593 1618 -787.4 1575
Random effects:
Groups Name Variance Std.Dev.
site (Intercept) 21.522 4.6392
Residual 173.888 13.1867
Number of obs: 120, groups: site, 6
Fixed effects:
Estimate Std. Error t value
(Intercept) 0.06733 1.92761 0.0349
diameter 0.14665 0.60693 0.2416
leafvit -0.19902 0.48802 -0.4078
densroot -0.49178 0.64221 -0.7658
rootvit 0.37699 0.46810 0.8054
cover -0.23545 0.57896 -0.4067
schirufu 0.23226 0.46866 0.4956
Correlation of Fixed Effects:
(Intr) diamtr leafvt densrt rootvt cover
diameter -0.016
leafvit 0.015 0.396
densroot 0.055 -0.233 -0.291
rootvit -0.038 -0.251 -0.629 0.277
cover 0.024 -0.796 -0.133 0.253 0.117
schirufu -0.032 0.137 -0.029 -0.505 -0.078 -0.121
This suggests no significant effects at all. In glmmPQL:
model<-glmmPQL(nemcm~diameter+leafvit+densroot+rootvit+cover+schirufu+offset(loglength),random=~1|site, family=quasipoisson)
iteration 1 iteration 2 iteration 3
summary (model)
Linear mixed-effects model fit by maximum likelihood
Data: NULL
AIC BIC logLik
NA NA NA
Random effects:
Formula: ~1 | site
(Intercept) Residual
StdDev: 0.2684477 4.507758
Variance function:
Structure: fixed weights
Formula: ~invwt
Fixed effects: nemcm ~ diameter + leafvit + densroot + rootvit + cover + schirufu + offset(loglength)
Value Std.Error DF t-value p-value
(Intercept) 0.1131898 0.1656949 108 0.6831220 0.4960
diameter 0.1225231 0.1976568 108 0.6198779 0.5366
leafvit -0.2191361 0.1697784 108 -1.2907181 0.1996
densroot -0.4733839 0.2221562 108 -2.1308604 0.0354
rootvit 0.3858120 0.1615706 108 2.3878846 0.0187
cover -0.2075038 0.1922054 108 -1.0795940 0.2827
schirufu 0.2028444 0.1633954 108 1.2414323 0.2171
Correlation:
(Intr) diamtr leafvt densrt rootvt cover
diameter -0.050
leafvit 0.077 0.360
densroot 0.217 -0.168 -0.262
rootvit -0.163 -0.202 -0.632 0.257
cover 0.084 -0.772 -0.098 0.200 0.073
schirufu -0.103 0.099 -0.050 -0.483 -0.068 -0.075
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-1.1146287 -0.5208003 -0.1927005 0.2462878 7.9755368
Number of Observations: 120
Number of Groups: 6
Again some differences in parameter estimates, but now the two fixed effects densroot and rootvit turn out to be significant.
So my questions are:
- what would you recommend me to use? lmer or glmmPQL (laplace approximation or penalized quasi-likelihood)?
- if lmer is the better option, is there a way to get a reliable p-value for the fixed effects?
I have experienced that deleting a term and comparing models using anova() always overestimates the significance of that term, probably because the quasipoisson correction for overdispersion is not taken into account.
Thank you very much beforehand,
Martijn.
--
Martijn Vandegehuchte
Ghent University
Department Biology
Terrestrial Ecology Unit
K.L.Ledeganckstraat 35
B-9000 Ghent
telephone: +32 (0)9/264 50 84
e-mail: martijn.vandegehuchte at ugent.be
website TEREC: www.ecology.ugent.be/terec
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
There is a large difference between the estimated std of the random effect, usually a sign that the glmmPQL approximation isn't working. I would try using adaptive Gauss-Hermite. Set the nAGQ parameter to increasing values until the results stabilise. Ken
On 08/10/2008, at 12:21 AM, Martijn Vandegehuchte wrote:
Dear list, First of all, I am a mere ecologist, trying to get the truth out of his data, and not a statistician, so forgive me my lack of statistical background and possible conceptual misunderstandings. I am currently comparing generalized linear mixed models in glmmPQL and lmer, with a quasipoisson family, and have found out that parameter estimates are quite different for both methods. I read some of the discussions on the R-forum and it seems that the Laplace approximation used in the current version of lmer is generally preferred to the PQL method. I am an ex-SAS user, and in proc glimmix in SAS the default is PQL, and the estimates and p-values are almost exact the same as with glmmPQL in R. But lmer gives quite different results, and now I am wondering what would be the best option for me. First of all, parameter estimates of a same model can be somewhat different in lmer or glmmPQL. Second of all, in lmer, I only get t- values but no associated p-values (apparently they are omitted because of the uncertainty about the df). But if I compare the t- values generated by glmmPQL with those of a same model in lmer, the differences are substantial. My dataset consists of 120 observations, so basically you could guess the order of magnitude of the p-values in lmer based on the t-value and a "large" df. First example: In lmer:
model<-lmer(schirufu~diameter+leafvit+densroot+cover+nemcm+(1| site),family=quasipoisson) summary (model)
Generalized linear mixed model fit by the Laplace approximation
Formula: schirufu ~ diameter + leafvit + densroot + cover + nemcm +
(1 | site)
AIC BIC logLik deviance
2045 2068 -1015 2029
Random effects:
Groups Name Variance Std.Dev.
site (Intercept) 12.700 3.5638
Residual 15.182 3.8964
Number of obs: 120, groups: site, 6
Fixed effects:
Estimate Std. Error t value
(Intercept) 1.31017 1.47249 0.890
diameter -0.24799 0.29180 -0.850
leafvit 1.29007 0.21041 6.131
densroot 0.31024 0.04939 6.281
cover -0.24544 0.22179 -1.107
nemcm 0.24817 0.12028 2.063
Correlation of Fixed Effects:
(Intr) diamtr leafvt densrt cover
diameter 0.031
leafvit -0.083 0.321
densroot 0.011 -0.017 -0.202
cover 0.021 -0.448 0.016 0.214
nemcm -0.014 0.114 0.114 0.310 -0.017
Although no p-values are given, it suggests that fixed effects leafvit, densroot and nemcm would be significant. In glmmPQL:
model<-glmmPQL(schirufu~diameter+leafvit+densroot+cover +nemcm,random=~1|site,family=quasipoisson)
iteration 1 iteration 2 iteration 3 iteration 4 iteration 5
summary(model)
Linear mixed-effects model fit by maximum likelihood
Data: NULL
AIC BIC logLik
NA NA NA
Random effects:
Formula: ~1 | site
(Intercept) Residual
StdDev: 0.7864989 4.63591
Variance function:
Structure: fixed weights
Formula: ~invwt
Fixed effects: schirufu ~ diameter + leafvit + densroot + cover +
nemcm
Value Std.Error DF t-value p-value
(Intercept) 1.4486735 0.4174843 109 3.470007 0.0007
diameter -0.2600504 0.3477017 109 -0.747913 0.4561
leafvit 1.2236406 0.2489291 109 4.915619 0.0000
densroot 0.3236446 0.0596342 109 5.427164 0.0000
cover -0.2523163 0.2698555 109 -0.935005 0.3519
nemcm 0.2336305 0.1451751 109 1.609301 0.1104
Correlation:
(Intr) diamtr leafvt densrt cover
diameter 0.130
leafvit -0.335 0.313
densroot 0.027 -0.022 -0.203
cover 0.090 -0.463 0.015 0.214
nemcm -0.056 0.097 0.107 0.301 -0.014
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-2.4956188 -0.4154369 -0.1333850 0.1724601 4.7355928
Number of Observations: 120
Number of Groups: 6
Note the difference in parameter estimates. Also, the fixed effect nemcm now is not significant any more. Second example,now with an offset: In lmer:
model<-lmer(nemcm~diameter+leafvit+densroot+rootvit+cover+schirufu +(1|site), offset= loglength, family=quasipoisson) summary (model)
Generalized linear mixed model fit by the Laplace approximation
Formula: nemcm ~ diameter + leafvit + densroot + rootvit + cover +
schirufu + (1 | site)
AIC BIC logLik deviance
1593 1618 -787.4 1575
Random effects:
Groups Name Variance Std.Dev.
site (Intercept) 21.522 4.6392
Residual 173.888 13.1867
Number of obs: 120, groups: site, 6
Fixed effects:
Estimate Std. Error t value
(Intercept) 0.06733 1.92761 0.0349
diameter 0.14665 0.60693 0.2416
leafvit -0.19902 0.48802 -0.4078
densroot -0.49178 0.64221 -0.7658
rootvit 0.37699 0.46810 0.8054
cover -0.23545 0.57896 -0.4067
schirufu 0.23226 0.46866 0.4956
Correlation of Fixed Effects:
(Intr) diamtr leafvt densrt rootvt cover
diameter -0.016
leafvit 0.015 0.396
densroot 0.055 -0.233 -0.291
rootvit -0.038 -0.251 -0.629 0.277
cover 0.024 -0.796 -0.133 0.253 0.117
schirufu -0.032 0.137 -0.029 -0.505 -0.078 -0.121
This suggests no significant effects at all. In glmmPQL:
model<-glmmPQL(nemcm~diameter+leafvit+densroot+rootvit+cover +schirufu+offset(loglength),random=~1|site, family=quasipoisson)
iteration 1 iteration 2 iteration 3
summary (model)
Linear mixed-effects model fit by maximum likelihood
Data: NULL
AIC BIC logLik
NA NA NA
Random effects:
Formula: ~1 | site
(Intercept) Residual
StdDev: 0.2684477 4.507758
Variance function:
Structure: fixed weights
Formula: ~invwt
Fixed effects: nemcm ~ diameter + leafvit + densroot + rootvit +
cover + schirufu + offset(loglength)
Value Std.Error DF t-value p-value
(Intercept) 0.1131898 0.1656949 108 0.6831220 0.4960
diameter 0.1225231 0.1976568 108 0.6198779 0.5366
leafvit -0.2191361 0.1697784 108 -1.2907181 0.1996
densroot -0.4733839 0.2221562 108 -2.1308604 0.0354
rootvit 0.3858120 0.1615706 108 2.3878846 0.0187
cover -0.2075038 0.1922054 108 -1.0795940 0.2827
schirufu 0.2028444 0.1633954 108 1.2414323 0.2171
Correlation:
(Intr) diamtr leafvt densrt rootvt cover
diameter -0.050
leafvit 0.077 0.360
densroot 0.217 -0.168 -0.262
rootvit -0.163 -0.202 -0.632 0.257
cover 0.084 -0.772 -0.098 0.200 0.073
schirufu -0.103 0.099 -0.050 -0.483 -0.068 -0.075
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-1.1146287 -0.5208003 -0.1927005 0.2462878 7.9755368
Number of Observations: 120
Number of Groups: 6
Again some differences in parameter estimates, but now the two fixed effects densroot and rootvit turn out to be significant. So my questions are: - what would you recommend me to use? lmer or glmmPQL (laplace approximation or penalized quasi-likelihood)? - if lmer is the better option, is there a way to get a reliable p- value for the fixed effects? I have experienced that deleting a term and comparing models using anova() always overestimates the significance of that term, probably because the quasipoisson correction for overdispersion is not taken into account. Thank you very much beforehand, Martijn. -- Martijn Vandegehuchte Ghent University Department Biology Terrestrial Ecology Unit K.L.Ledeganckstraat 35 B-9000 Ghent telephone: +32 (0)9/264 50 84 e-mail: martijn.vandegehuchte at ugent.be website TEREC: www.ecology.ugent.be/terec [[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
On Tue, Oct 7, 2008 at 4:07 PM, Ken Beath <ken at kjbeath.com.au> wrote:
There is a large difference between the estimated std of the random effect, usually a sign that the glmmPQL approximation isn't working.
Or that there is a mistake in the calculation of the standard errors for the random effects, which is more likely in this case. The actual optimization is with respect to the relative standard deviation of the random effects (relative to the scale parameter in the conditional standard deviation of the response). For the Poisson family or the binomial family that scale parameter is fixed at 1 (you could also consider the situation to be that there isn't a scale parameter in those cases). For the quasipoisson and quasibinomial families you maybe estimate a value there or maybe not. I don't know. I believe Ben's simulations showed that I was doing the wrong thing there.
I would try using adaptive Gauss-Hermite. Set the nAGQ parameter to increasing values until the results stabilise. Ken On 08/10/2008, at 12:21 AM, Martijn Vandegehuchte wrote:
Dear list, First of all, I am a mere ecologist, trying to get the truth out of his data, and not a statistician, so forgive me my lack of statistical background and possible conceptual misunderstandings. I am currently comparing generalized linear mixed models in glmmPQL and lmer, with a quasipoisson family, and have found out that parameter estimates are quite different for both methods. I read some of the discussions on the R-forum and it seems that the Laplace approximation used in the current version of lmer is generally preferred to the PQL method. I am an ex-SAS user, and in proc glimmix in SAS the default is PQL, and the estimates and p-values are almost exact the same as with glmmPQL in R. But lmer gives quite different results, and now I am wondering what would be the best option for me. First of all, parameter estimates of a same model can be somewhat different in lmer or glmmPQL. Second of all, in lmer, I only get t-values but no associated p-values (apparently they are omitted because of the uncertainty about the df). But if I compare the t-values generated by glmmPQL with those of a same model in lmer, the differences are substantial. My dataset consists of 120 observations, so basically you could guess the order of magnitude of the p-values in lmer based on the t-value and a "large" df. First example: In lmer:
model<-lmer(schirufu~diameter+leafvit+densroot+cover+nemcm+(1|site),family=quasipoisson) summary (model)
Generalized linear mixed model fit by the Laplace approximation
Formula: schirufu ~ diameter + leafvit + densroot + cover + nemcm + (1 |
site)
AIC BIC logLik deviance
2045 2068 -1015 2029
Random effects:
Groups Name Variance Std.Dev.
site (Intercept) 12.700 3.5638
Residual 15.182 3.8964
Number of obs: 120, groups: site, 6
Fixed effects:
Estimate Std. Error t value
(Intercept) 1.31017 1.47249 0.890
diameter -0.24799 0.29180 -0.850
leafvit 1.29007 0.21041 6.131
densroot 0.31024 0.04939 6.281
cover -0.24544 0.22179 -1.107
nemcm 0.24817 0.12028 2.063
Correlation of Fixed Effects:
(Intr) diamtr leafvt densrt cover
diameter 0.031
leafvit -0.083 0.321
densroot 0.011 -0.017 -0.202
cover 0.021 -0.448 0.016 0.214
nemcm -0.014 0.114 0.114 0.310 -0.017
Although no p-values are given, it suggests that fixed effects leafvit, densroot and nemcm would be significant. In glmmPQL:
model<-glmmPQL(schirufu~diameter+leafvit+densroot+cover+nemcm,random=~1|site,family=quasipoisson)
iteration 1 iteration 2 iteration 3 iteration 4 iteration 5
summary(model)
Linear mixed-effects model fit by maximum likelihood
Data: NULL
AIC BIC logLik
NA NA NA
Random effects:
Formula: ~1 | site
(Intercept) Residual
StdDev: 0.7864989 4.63591
Variance function:
Structure: fixed weights
Formula: ~invwt
Fixed effects: schirufu ~ diameter + leafvit + densroot + cover + nemcm
Value Std.Error DF t-value p-value
(Intercept) 1.4486735 0.4174843 109 3.470007 0.0007
diameter -0.2600504 0.3477017 109 -0.747913 0.4561
leafvit 1.2236406 0.2489291 109 4.915619 0.0000
densroot 0.3236446 0.0596342 109 5.427164 0.0000
cover -0.2523163 0.2698555 109 -0.935005 0.3519
nemcm 0.2336305 0.1451751 109 1.609301 0.1104
Correlation:
(Intr) diamtr leafvt densrt cover
diameter 0.130
leafvit -0.335 0.313
densroot 0.027 -0.022 -0.203
cover 0.090 -0.463 0.015 0.214
nemcm -0.056 0.097 0.107 0.301 -0.014
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-2.4956188 -0.4154369 -0.1333850 0.1724601 4.7355928
Number of Observations: 120
Number of Groups: 6
Note the difference in parameter estimates. Also, the fixed effect nemcm now is not significant any more. Second example,now with an offset: In lmer:
model<-lmer(nemcm~diameter+leafvit+densroot+rootvit+cover+schirufu+(1|site), offset= loglength, family=quasipoisson) summary (model)
Generalized linear mixed model fit by the Laplace approximation
Formula: nemcm ~ diameter + leafvit + densroot + rootvit + cover +
schirufu + (1 | site)
AIC BIC logLik deviance
1593 1618 -787.4 1575
Random effects:
Groups Name Variance Std.Dev.
site (Intercept) 21.522 4.6392
Residual 173.888 13.1867
Number of obs: 120, groups: site, 6
Fixed effects:
Estimate Std. Error t value
(Intercept) 0.06733 1.92761 0.0349
diameter 0.14665 0.60693 0.2416
leafvit -0.19902 0.48802 -0.4078
densroot -0.49178 0.64221 -0.7658
rootvit 0.37699 0.46810 0.8054
cover -0.23545 0.57896 -0.4067
schirufu 0.23226 0.46866 0.4956
Correlation of Fixed Effects:
(Intr) diamtr leafvt densrt rootvt cover
diameter -0.016
leafvit 0.015 0.396
densroot 0.055 -0.233 -0.291
rootvit -0.038 -0.251 -0.629 0.277
cover 0.024 -0.796 -0.133 0.253 0.117
schirufu -0.032 0.137 -0.029 -0.505 -0.078 -0.121
This suggests no significant effects at all. In glmmPQL:
model<-glmmPQL(nemcm~diameter+leafvit+densroot+rootvit+cover+schirufu+offset(loglength),random=~1|site, family=quasipoisson)
iteration 1 iteration 2 iteration 3
summary (model)
Linear mixed-effects model fit by maximum likelihood
Data: NULL
AIC BIC logLik
NA NA NA
Random effects:
Formula: ~1 | site
(Intercept) Residual
StdDev: 0.2684477 4.507758
Variance function:
Structure: fixed weights
Formula: ~invwt
Fixed effects: nemcm ~ diameter + leafvit + densroot + rootvit + cover +
schirufu + offset(loglength)
Value Std.Error DF t-value p-value
(Intercept) 0.1131898 0.1656949 108 0.6831220 0.4960
diameter 0.1225231 0.1976568 108 0.6198779 0.5366
leafvit -0.2191361 0.1697784 108 -1.2907181 0.1996
densroot -0.4733839 0.2221562 108 -2.1308604 0.0354
rootvit 0.3858120 0.1615706 108 2.3878846 0.0187
cover -0.2075038 0.1922054 108 -1.0795940 0.2827
schirufu 0.2028444 0.1633954 108 1.2414323 0.2171
Correlation:
(Intr) diamtr leafvt densrt rootvt cover
diameter -0.050
leafvit 0.077 0.360
densroot 0.217 -0.168 -0.262
rootvit -0.163 -0.202 -0.632 0.257
cover 0.084 -0.772 -0.098 0.200 0.073
schirufu -0.103 0.099 -0.050 -0.483 -0.068 -0.075
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-1.1146287 -0.5208003 -0.1927005 0.2462878 7.9755368
Number of Observations: 120
Number of Groups: 6
Again some differences in parameter estimates, but now the two fixed
effects densroot and rootvit turn out to be significant.
So my questions are:
- what would you recommend me to use? lmer or glmmPQL (laplace
approximation or penalized quasi-likelihood)?
- if lmer is the better option, is there a way to get a reliable p-value
for the fixed effects?
I have experienced that deleting a term and comparing models using anova()
always overestimates the significance of that term, probably because the
quasipoisson correction for overdispersion is not taken into account.
Thank you very much beforehand,
Martijn.
--
Martijn Vandegehuchte
Ghent University
Department Biology
Terrestrial Ecology Unit
K.L.Ledeganckstraat 35
B-9000 Ghent
telephone: +32 (0)9/264 50 84
e-mail: martijn.vandegehuchte at ugent.be
website TEREC: www.ecology.ugent.be/terec
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Douglas Bates wrote:
[snip]
It may be that the problem you are encountering has more to do with the use of the quasipoisson family than with the Laplace approximation. I am not sure that the derivation of the standard errors in lmer when using the quasipoisson family is correct, in part because I don't really understand the quasipoisson and quasibinomial families. As far as I know, they don't correspond to probability distributions so the theory is a bit iffy. Do you need to use the quasipoisson family or could you use the poisson family? Generally the motivation for the quasipoisson familiy is to accomodate overdispersion. Often in a generalized linear mixed model the problem is underdispersion rather than overdispersion.
In ecological data it's quite common to have dispersion remaining in a GLMM even after accounting for known grouping factors.
In one of Ben's replies in this thread he discusses the degrees of freedom attributed to certain t-statistics. Regular readers of this list are aware that degrees of freedom is one of my least favorite topics. If one has a reasonably large number of observations and a reasonably large number of groups then the issue is unimportant. (Uncertainty in degrees of freedom is important only when the value of the degrees of freedom is small. In fact, when I first started studying statistics we used the standard normal in place of the t-distribution whenever the degrees of freedom exceeded 30). Considering that the quasi-Poisson doesn't correspond to a probability distribution in the first place, (readers should feel free to correct me if I am wrong about this) I find the issue of the number of degrees of freedom that should be attributed to a distribution of a quantity calculated from a non-existent distribution to be somewhat off the point.
Fair enough, but backing up to wanting p-values associated with (sensible???) hypothesis tests -- we don't have mcmcsamp for GLMMs, so our options are (1) no p-values at all, (2) Z tests (i.e. don't worry about uncertainty in the scale parameter estimate, (3) ?? simulation from the null hypothesis. See http://www.zoo.ufl.edu/bolker/glmm/glmersim.pdf
I think the problem is more likely that the standard errors are not being calculated correctly. Is that what you concluded from your simulations, Ben?
Yes. http://www.zoo.ufl.edu/bolker/glmm/quasitest.pdf By the way, I'd be more than happy for any input on the above-referenced URLs -- if anyone thinks (and can argue reasonably convincingly) that they're wrong and/or misguided I'll take them down so as not to mislead the public. The Sweave files are up in the same place (substitute .Rnw for .pdf) cheers Ben
Thank you for this explanation. My dataset consists of 120 observations, with 6 levels of the random effect (20 observations per level). I think in my case the df, like you say, are not the point, because the number is rather large. Considering the quasipoisson family, I do not really know how it works or what it does (like I said, I am not a statistician at all). I only learned that it is a way to deal with overdispersion, and yes, my data are highly overdispersed. If I make the same models with lmer with a poisson family, all the effects become highly significant, so the correction seems necessary. Another remark: by now I found out that when I use poisson or quasipoisson, the results with glmmPQL are exactly the same, and still quite close to the quasipoisson results of lmer (as opposed to the poisson results of lmer). I have been searching for information about the quasi families, but I can't seem to find any. I'm a bit puzzled by this at the moment. Why is there a huge difference between quasipoisson and poisson with lmer and not with glmmPQL? It seems that glmmPQL already accounts for the overdispersion, also because when I compare the results with the output of SAS proc glimmix, the results of glmmPQL are exactly the same only if I put the overdispersion correction "random _residual_" in SAS. Also, the overdispersion in my data is to a large extent due to the amount of zeroes. I have been searching for ways to build ZIP models, but in SAS with proc nlmixed, it is quite complex, and with glmm.admb() in R, I also get strange results and warning messages, and I have problems with obtaining the output I need. Now a colleague from my lab just pointed out that a glmm with a correction for overdispersion gave very similar results compared with ZIP models in a study of his. And besides that, the overdispersion in my case is not only due to the zeroes, there are also some large values in my dependent variables, so I'm not even sure if a ZIP model is the right way to deal with it. But maybe it is a way to avoid the quasifamilies, if they should be avoided at all. It's just a thought. I would appreciate your ideas on these matters, Thank you for your time and effort, Martijn.
Martijn Vandegehuchte Ghent University Department Biology Terrestrial Ecology Unit K.L.Ledeganckstraat 35 B-9000 Ghent telephone: +32 (0)9/264 50 84 e-mail: martijn.vandegehuchte at ugent.be website TEREC: www.ecology.ugent.be/terec ----- Original Message ----- From: "Douglas Bates" <bates at stat.wisc.edu> To: "Martijn Vandegehuchte" <martijn.vandegehuchte at ugent.be> Cc: <r-sig-mixed-models at r-project.org> Sent: Tuesday, October 07, 2008 10:59 PM Subject: Re: [R-sig-ME] generalized linear mixed models: large differences when using glmmPQL or lmer with laplace approximation > Due to some travel and the need to attend to other projects, I haven't > been keeping up as closely with this list as I normally do. Regarding > the comparison between the PQL and Laplace methods for fitting > generalized linear mixed models, I believe that the estimates produced > by the Laplace method are more reliable than those from the PQL > method. The objective function optimized by the Laplace method is a > direct approxmation, and generally a very good approximation, to the > log-likelihood for the model being fit. The PQL method is indirect > (the "QL" part of the name stands for "quasi-likelihood") and, because > it involves alternating conditional optimization, can alternate > back-and-forth between two potential solutions, neither of which is > optimal. (To be fair, such alternating occurs more frequently in the > analogous method for nonlinear mixed-models, in which I was one of the > co-conspirators, than in the PQL method for GLMMs.) > > It may be that the problem you are encountering has more to do with > the use of the quasipoisson family than with the Laplace > approximation. I am not sure that the derivation of the standard > errors in lmer when using the quasipoisson family is correct, in part > because I don't really understand the quasipoisson and quasibinomial > families. As far as I know, they don't correspond to probability > distributions so the theory is a bit iffy. > > Do you need to use the quasipoisson family or could you use the > poisson family? Generally the motivation for the quasipoisson familiy > is to accomodate overdispersion. Often in a generalized linear mixed > model the problem is underdispersion rather than overdispersion. > > In one of Ben's replies in this thread he discusses the degrees of > freedom attributed to certain t-statistics. Regular readers of this > list are aware that degrees of freedom is one of my least favorite > topics. If one has a reasonably large number of observations and a > reasonably large number of groups then the issue is unimportant. > (Uncertainty in degrees of freedom is important only when the value of > the degrees of freedom is small. In fact, when I first started > studying statistics we used the standard normal in place of the > t-distribution whenever the degrees of freedom exceeded 30). > Considering that the quasi-Poisson doesn't correspond to a probability > distribution in the first place, (readers should feel free to correct > me if I am wrong about this) I find the issue of the number of degrees > of freedom that should be attributed to a distribution of a quantity > calculated from a non-existent distribution to be somewhat off the > point. > > I think the problem is more likely that the standard errors are not > being calculated correctly. Is that what you concluded from your > simulations, Ben? > > On Tue, Oct 7, 2008 at 8:21 AM, Martijn Vandegehuchte > <martijn.vandegehuchte at ugent.be> wrote: >> Dear list, > >> First of all, I am a mere ecologist, trying to get the truth out of his >> data, and not a statistician, so forgive me my lack of statistical >> background and possible conceptual misunderstandings. > >> I am currently comparing generalized linear mixed models in glmmPQL and >> lmer, with a quasipoisson family, and have found out that parameter >> estimates are quite different for both methods. I read some of the >> discussions on the R-forum and it seems that the Laplace approximation >> used in the current version of lmer is generally preferred to the PQL >> method. I am an ex-SAS user, and in proc glimmix in SAS the default is >> PQL, and the estimates and p-values are almost exact the same as with >> glmmPQL in R. But lmer gives quite different results, and now I am >> wondering what would be the best option for me. > >> First of all, parameter estimates of a same model can be somewhat >> different in lmer or glmmPQL. Second of all, in lmer, I only get t-values >> but no associated p-values (apparently they are omitted because of the >> uncertainty about the df). But if I compare the t-values generated by >> glmmPQL with those of a same model in lmer, the differences are >> substantial. My dataset consists of 120 observations, so basically you >> could guess the order of magnitude of the p-values in lmer based on the >> t-value and a "large" df. > >> First example: >> In lmer: >> >>> model<-lmer(schirufu~diameter+leafvit+densroot+cover+nemcm+(1|site),family=quasipoisson) >>> summary (model) >> Generalized linear mixed model fit by the Laplace approximation >> Formula: schirufu ~ diameter + leafvit + densroot + cover + nemcm + (1 | >> site) >> AIC BIC logLik deviance >> 2045 2068 -1015 2029 >> Random effects: >> Groups Name Variance Std.Dev. >> site (Intercept) 12.700 3.5638 >> Residual 15.182 3.8964 >> Number of obs: 120, groups: site, 6 >> >> Fixed effects: >> Estimate Std. Error t value >> (Intercept) 1.31017 1.47249 0.890 >> diameter -0.24799 0.29180 -0.850 >> leafvit 1.29007 0.21041 6.131 >> densroot 0.31024 0.04939 6.281 >> cover -0.24544 0.22179 -1.107 >> nemcm 0.24817 0.12028 2.063 >> >> Correlation of Fixed Effects: >> (Intr) diamtr leafvt densrt cover >> diameter 0.031 >> leafvit -0.083 0.321 >> densroot 0.011 -0.017 -0.202 >> cover 0.021 -0.448 0.016 0.214 >> nemcm -0.014 0.114 0.114 0.310 -0.017 >>> >> >> Although no p-values are given, it suggests that fixed effects leafvit, >> densroot and nemcm would be significant. >> In glmmPQL: >> >>> model<-glmmPQL(schirufu~diameter+leafvit+densroot+cover+nemcm,random=~1|site,family=quasipoisson) >> iteration 1 >> iteration 2 >> iteration 3 >> iteration 4 >> iteration 5 >>> summary(model) >> Linear mixed-effects model fit by maximum likelihood >> Data: NULL >> AIC BIC logLik >> NA NA NA >> >> Random effects: >> Formula: ~1 | site >> (Intercept) Residual >> StdDev: 0.7864989 4.63591 >> >> Variance function: >> Structure: fixed weights >> Formula: ~invwt >> Fixed effects: schirufu ~ diameter + leafvit + densroot + cover + nemcm >> Value Std.Error DF t-value p-value >> (Intercept) 1.4486735 0.4174843 109 3.470007 0.0007 >> diameter -0.2600504 0.3477017 109 -0.747913 0.4561 >> leafvit 1.2236406 0.2489291 109 4.915619 0.0000 >> densroot 0.3236446 0.0596342 109 5.427164 0.0000 >> cover -0.2523163 0.2698555 109 -0.935005 0.3519 >> nemcm 0.2336305 0.1451751 109 1.609301 0.1104 >> Correlation: >> (Intr) diamtr leafvt densrt cover >> diameter 0.130 >> leafvit -0.335 0.313 >> densroot 0.027 -0.022 -0.203 >> cover 0.090 -0.463 0.015 0.214 >> nemcm -0.056 0.097 0.107 0.301 -0.014 >> >> Standardized Within-Group Residuals: >> Min Q1 Med Q3 Max >> -2.4956188 -0.4154369 -0.1333850 0.1724601 4.7355928 >> >> Number of Observations: 120 >> Number of Groups: 6 >>> >> >> Note the difference in parameter estimates. Also, the fixed effect nemcm >> now is not significant any more. >> >> Second example,now with an offset: >> In lmer: >> >>> model<-lmer(nemcm~diameter+leafvit+densroot+rootvit+cover+schirufu+(1|site), >>> offset= loglength, family=quasipoisson) >>> summary (model) >> Generalized linear mixed model fit by the Laplace approximation >> Formula: nemcm ~ diameter + leafvit + densroot + rootvit + cover + >> schirufu + (1 | site) >> AIC BIC logLik deviance >> 1593 1618 -787.4 1575 >> Random effects: >> Groups Name Variance Std.Dev. >> site (Intercept) 21.522 4.6392 >> Residual 173.888 13.1867 >> Number of obs: 120, groups: site, 6 >> >> Fixed effects: >> Estimate Std. Error t value >> (Intercept) 0.06733 1.92761 0.0349 >> diameter 0.14665 0.60693 0.2416 >> leafvit -0.19902 0.48802 -0.4078 >> densroot -0.49178 0.64221 -0.7658 >> rootvit 0.37699 0.46810 0.8054 >> cover -0.23545 0.57896 -0.4067 >> schirufu 0.23226 0.46866 0.4956 >> >> Correlation of Fixed Effects: >> (Intr) diamtr leafvt densrt rootvt cover >> diameter -0.016 >> leafvit 0.015 0.396 >> densroot 0.055 -0.233 -0.291 >> rootvit -0.038 -0.251 -0.629 0.277 >> cover 0.024 -0.796 -0.133 0.253 0.117 >> schirufu -0.032 0.137 -0.029 -0.505 -0.078 -0.121 >>> >> >> This suggests no significant effects at all. >> In glmmPQL: >> >>> model<-glmmPQL(nemcm~diameter+leafvit+densroot+rootvit+cover+schirufu+offset(loglength),random=~1|site, >>> family=quasipoisson) >> iteration 1 >> iteration 2 >> iteration 3 >>> summary (model) >> Linear mixed-effects model fit by maximum likelihood >> Data: NULL >> AIC BIC logLik >> NA NA NA >> >> Random effects: >> Formula: ~1 | site >> (Intercept) Residual >> StdDev: 0.2684477 4.507758 >> >> Variance function: >> Structure: fixed weights >> Formula: ~invwt >> Fixed effects: nemcm ~ diameter + leafvit + densroot + rootvit + cover + >> schirufu + offset(loglength) >> Value Std.Error DF t-value p-value >> (Intercept) 0.1131898 0.1656949 108 0.6831220 0.4960 >> diameter 0.1225231 0.1976568 108 0.6198779 0.5366 >> leafvit -0.2191361 0.1697784 108 -1.2907181 0.1996 >> densroot -0.4733839 0.2221562 108 -2.1308604 0.0354 >> rootvit 0.3858120 0.1615706 108 2.3878846 0.0187 >> cover -0.2075038 0.1922054 108 -1.0795940 0.2827 >> schirufu 0.2028444 0.1633954 108 1.2414323 0.2171 >> Correlation: >> (Intr) diamtr leafvt densrt rootvt cover >> diameter -0.050 >> leafvit 0.077 0.360 >> densroot 0.217 -0.168 -0.262 >> rootvit -0.163 -0.202 -0.632 0.257 >> cover 0.084 -0.772 -0.098 0.200 0.073 >> schirufu -0.103 0.099 -0.050 -0.483 -0.068 -0.075 >> >> Standardized Within-Group Residuals: >> Min Q1 Med Q3 Max >> -1.1146287 -0.5208003 -0.1927005 0.2462878 7.9755368 >> >> Number of Observations: 120 >> Number of Groups: 6 >>> >> >> Again some differences in parameter estimates, but now the two fixed >> effects densroot and rootvit turn out to be significant. >> So my questions are: >> - what would you recommend me to use? lmer or glmmPQL (laplace >> approximation or penalized quasi-likelihood)? >> - if lmer is the better option, is there a way to get a reliable p-value >> for the fixed effects? >> I have experienced that deleting a term and comparing models using >> anova() always overestimates the significance of that term, probably >> because the quasipoisson correction for overdispersion is not taken into >> account. >> >> Thank you very much beforehand, >> >> Martijn. >> >> -- >> Martijn Vandegehuchte >> Ghent University >> Department Biology >> Terrestrial Ecology Unit >> K.L.Ledeganckstraat 35 >> B-9000 Ghent >> telephone: +32 (0)9/264 50 84 >> e-mail: martijn.vandegehuchte at ugent.be >> >> website TEREC: www.ecology.ugent.be/terec >> >> [[alternative HTML version deleted]] >> >> _______________________________________________ >> R-sig-mixed-models at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models >> >
2 days later
On 08/10/2008, at 8:18 AM, Douglas Bates wrote:
On Tue, Oct 7, 2008 at 4:07 PM, Ken Beath <ken at kjbeath.com.au> wrote:
There is a large difference between the estimated std of the random effect, usually a sign that the glmmPQL approximation isn't working.
Or that there is a mistake in the calculation of the standard errors for the random effects, which is more likely in this case. The actual optimization is with respect to the relative standard deviation of the random effects (relative to the scale parameter in the conditional standard deviation of the response). For the Poisson family or the binomial family that scale parameter is fixed at 1 (you could also consider the situation to be that there isn't a scale parameter in those cases). For the quasipoisson and quasibinomial families you maybe estimate a value there or maybe not. I don't know. I believe Ben's simulations showed that I was doing the wrong thing there
Definitely something wrong. I did some simulations of my own using
Poisson distributed data. The standard error of the fixed effects also
seems rather large.
> nsubj <- 100
> npersubj <- 20
>
> subject <- factor(rep(1:nsubj,each=npersubj))
>
> means <- exp(rep(10+rnorm(nsubj),each=npersubj))
>
> y <- rpois(nsubj*npersubj,means)
>
> simdata <- data.frame(y,subject)
>
> lmer1 <- lmer(y~(1|subject),data=simdata,family=poisson)
> summary(lmer1)
Generalized linear mixed model fit by the Laplace approximation
Formula: y ~ (1 | subject)
Data: simdata
AIC BIC logLik deviance
3329 3341 -1663 3325
Random effects:
Groups Name Variance Std.Dev.
subject (Intercept) 0.9102 0.95405
Number of obs: 2000, groups: subject, 100
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 9.9734 0.0954 104.5 <2e-16 ***
---
Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
>
> lmer2 <- lmer(y~(1|subject),data=simdata,family=quasipoisson)
> summary(lmer2)
Generalized linear mixed model fit by the Laplace approximation
Formula: y ~ (1 | subject)
Data: simdata
AIC BIC logLik deviance
3331 3348 -1663 3325
Random effects:
Groups Name Variance Std.Dev.
subject (Intercept) 11794 108.60
Residual 12957 113.83
Number of obs: 2000, groups: subject, 100
Fixed effects:
Estimate Std. Error t value
(Intercept) 9.973 10.860 0.9184
>
Ken