Skip to content

glmmPQL Help: Random Effect and Dispersion Parameter

7 messages · Dennis Murphy, Douglas Bates, Reinhold Kliegl +1 more

#
Dear R users,

I am currently doing a project in generalized mixed model, and I find
the R function glmmPQL in MASS can do this via PQL. But I am a newbie
in R, the input and output of glmmPQL confused me, and I wish I can
get some answers here.

The model I used is a typical two-way generalized mixed model with
random subject (row) and block (column) effects and log link function,
y_{ij} = exp(\mu+\alpha_i+\beta_j)+\epsilon.

I can generate a pseudo data by the following R code

===================================================
k <- 5; # Number of Blocks (Columns)
n <- 100; # Number of Subjects (Rows)
m <- 3; # Number of Replications in Each Cell

sigma.a <- 0.5; # Var of Subjects Effects
sigma.b <- 0.1; # Var of Block Effects
sigma.e <- 0.01; # Var of Errors
mu <- 1; # Overall mean

a <- rep(rnorm(n,0,sigma.a),each=k*m);
b <- rep(rep(rnorm(k,0,sigma.b),each=m),n);

# Simulate responses y_{ij}=exp(\mu+\alpha_i+\beta_j)+e
y <- exp(mu+a+b)+rnorm(1,0,sigma.e);

# Indicator vector of subject effects alpha
alpha <- rep(seq(1,n),each=k*m);

# Indicator vector of block effects beta
beta <- rep(rep(seq(1:k),each=m),n);

simu <- data.frame(y,beta,alpha)
===================================================

And I want to use glmmPQL to estimate the mean and variance
components, but I have several questions.

1. How to write the random effect formula?
I have tried
glm <- glmmPQL(y~1,random=~alpha+beta,family=gaussian(link="log"),data=simu);
but it did not work and got the error message "Invalid formula for groups".

And the command
glm <- glmmPQL(y~1,random=~1|alpha/beta,family=gaussian(link="log"),data=simu)
worked, but the result was the nested "beta %in% alpha" variances,
which was not what I want.

2. How to find the estimated variance-covariance matrix for the
variance components, which should be the inverse of information
matrix.
I notice the output variable glm at apVar will give a similar
variance-covariance matrix, but it has the prefix "reStruct." and
attribute "Pars", what are these stand for? I can't find any
explanation in the help document.

3. I am also wondering if there is a way to calculate the dispersion
parameter or not?

Anyone has some tips? Any suggestions will be greatly appreciated.

Best,

Yue Yu
#
Hi:

glmmPQL has been around a while, and I suspect it was not meant to
handle crossed random effects. This was one of the original
motivations for the lme4 package, and it seems to work there, although
it's using Gauss-Hermite approximations to the likelihood rather than
PQL:

library(lme4)
mod1 <- lmer(y ~ 1 + (1 | beta) + (1 | alpha), data = simu)
Linear mixed model fit by REML ['summary.mer']
Formula: y ~ 1 + (1 | beta) + (1 | alpha)
   Data: simu
REML criterion at convergence: 584.4204

Random effects:
 Groups   Name        Variance Std.Dev.
 alpha    (Intercept) 3.11128  1.7639
 beta     (Intercept) 0.17489  0.4182
 Residual             0.05405  0.2325
Number of obs: 1500, groups: alpha, 100; beta, 5

Fixed effects:
            Estimate Std. Error t value
(Intercept)   2.9167     0.2572   11.34

Hopefully that's closer to what you had in mind. If not, take a look
at Ben Bolker's GLMM wiki:

http://glmm.wikidot.com/faq

BTW, thank you for the nice reproducible example.

Dennis
On Thu, Jun 23, 2011 at 9:16 PM, Yue Yu <parn.yy at gmail.com> wrote:
#
Thanks a lot, Dennis.

The reason I am not using lme4 is that its estimated variance
component seems not desirable.

If you use my simulated data and run the following line

glm2 <- glmer(y~1+(1|alpha)+(1|beta),family=gaussian(link="log"),data=simu)

the result will be

Generalized linear mixed model fit by the Laplace approximation
Formula: y ~ 1 + (1 | alpha) + (1 | beta)
   Data: simu
 AIC   BIC logLik deviance
 508 529.2   -250      500
Random effects:
 Groups   Name        Variance   Std.Dev.
 alpha    (Intercept) 0.01957449 0.139909
 beta     (Intercept) 0.00080326 0.028342
 Residual             0.06888192 0.262454
Number of obs: 1500, groups: alpha, 100; beta, 5

Fixed effects:
            Estimate Std. Error t value
(Intercept)  1.13506    0.01907   59.52

The Std.Dev of alpha, beta and residual is far way from the true value
in my simulation. While glmmPQL will give a better result.

But I still need to find the variance matrix for variance components
and the dispersion parameter, any suggestions?

Yue Yu
On Fri, Jun 24, 2011 at 09:52, Dennis Murphy <djmuser at gmail.com> wrote:
#
On Fri, Jun 24, 2011 at 2:28 PM, Yue Yu <parn.yy at gmail.com> wrote:
Hmm, in lme4a the results, shown in the enclosed, are much closer to
the values used for simulation.  However, this example does show a
deficiency in this implementation of glmer in that the estimate of the
residual standard deviation is not calculated and not shown here.
The best suggestion is don't do it.  Estimates of variance components
are not symmetrically distributed (think of the simplest case of the
estimator of the variance from an i.i.d Gaussian sample, which has a
chi-squared distribution).
#
This time with the enclosure :-)
On Sat, Jun 25, 2011 at 11:52 AM, Douglas Bates <bates at stat.wisc.edu> wrote:
-------------- next part --------------

R version 2.14.0 Under development (unstable) (2011-06-24 r56210)
Copyright (C) 2011 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-unknown-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
Loading required package: Matrix
Loading required package: lattice

Attaching package: ?Matrix?

The following object(s) are masked from ?package:base?:

    det

Loading required package: minqa
Loading required package: Rcpp
Loading required package: MatrixModels

Attaching package: ?MatrixModels?

The following object(s) are masked from ?package:stats?:

    getCall
npt = 4 , n =  2 
rhobeg =  0.2 , rhoend =  2e-07 
   0.020:   7:      512.232;0.600458 0.986477 
  0.0020:  14:      509.499;0.554646 0.887092 
 0.00020:  51:      490.819;0.536841 0.0779844 
 2.0e-05:  60:      490.815;0.539243 0.0788170 
 2.0e-06:  62:      490.815;0.539294 0.0787849 
 2.0e-07:  66:      490.815;0.539295 0.0787820 
At return
 72:     490.81485: 0.539295 0.0787822
npt = 5 , n =  3 
rhobeg =  0.2 , rhoend =  2e-07 
   0.020:   6:      495.473;0.539295 0.200000  1.10770 
  0.0020:  14:      491.648;0.546593 0.109993  1.11128 
 0.00020:  24:      490.787;0.539576 0.0785051  1.09634 
 2.0e-05:  28:      490.787;0.539552 0.0788343  1.09663 
 2.0e-06:  37:      490.787;0.539619 0.0787000  1.09667 
 2.0e-07:  53:      490.787;0.539613 0.0787062  1.09676 
At return
 66:     490.78710: 0.539613 0.0787075  1.09676
Generalized linear mixed model fit by maximum likelihood ['summary.mer']
 Family: gaussian 
Formula: y ~ 1 + (1 | alpha) + (1 | beta) 
   Data: simu 
      AIC       BIC    logLik  deviance 
 496.7871  512.7268 -245.3935  490.7871 

Random effects:
 Groups Name        Variance Std.Dev.
 alpha  (Intercept) 0.291182 0.53961 
 beta   (Intercept) 0.006195 0.07871 
Number of obs: 1500, groups: alpha, 100; beta, 5

Fixed effects:
            Estimate Std. Error z value
(Intercept)  1.09676    0.06532   16.79
user  system elapsed 
 11.780   0.270  12.248
1 day later
#
A few comments on the GLMM simulation example.

First, the simulated data use only a single value as an error. Thus, the line
was probably meant to be:
Second, usually this kind of function is used if y can only assume
positive values.
If this was intended for this example, the error should be included in
the exp(). Thus,
Third, with these modifications, the following two calls recover the
input parameters nicely:
Unfortunately, as Douglas Bates pointed out, the residual parameter is
not returned in m1a.

Fourth, there appears to be an inconsistency when the two models are
fitted with lmer of lme4.
In m1, parameters are not returned, at least not in the expected
metric, as in m1a.
(MLs, deviances are very similar for m1 and m1a.) In m2, parameters
are returned as in m2a.

I attach an output illustrating the differences..

Reinhold Kliegl
On Sat, Jun 25, 2011 at 6:53 PM, Douglas Bates <bates at stat.wisc.edu> wrote:
-------------- next part --------------

R version 2.13.0 (2011-04-13)
Copyright (C) 2011 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

[R.app GUI 1.40 (5751) x86_64-apple-darwin9.8.0]
Loading required package: Matrix
Loading required package: lattice

Attaching package: 'Matrix'

The following object(s) are masked from 'package:base':

    det


Attaching package: 'lme4'

The following object(s) are masked from 'package:stats':

    AIC, BIC
Linear mixed model fit by REML 
Formula: y0 ~ 1 + (1 | alpha) + (1 | beta) 
   Data: simu 
     AIC     BIC logLik deviance REMLdev
 -253106 -253072 126557  -253118 -253114
Random effects:
 Groups   Name        Variance   Std.Dev.
 alpha    (Intercept) 0.23916183 0.489042
 beta     (Intercept) 0.01041611 0.102059
 Residual             0.00010043 0.010022
Number of obs: 40000, groups: alpha, 100; beta, 20

Fixed effects:
            Estimate Std. Error t value
(Intercept)  1.06616    0.05397   19.76
Generalized linear mixed model fit by the Laplace approximation 
Formula: y1 ~ 1 + (1 | alpha) + (1 | beta) 
   Data: simu 
   AIC   BIC logLik deviance
 908.6 942.9 -450.3    900.6
Random effects:
 Groups   Name        Variance  Std.Dev. 
 alpha    (Intercept) 0.0007268 0.0269592
 beta     (Intercept) 0.0000316 0.0056214
 Residual             0.0030743 0.0554466
Number of obs: 40000, groups: alpha, 100; beta, 20

Fixed effects:
            Estimate Std. Error t value
(Intercept) 1.066542   0.002977   358.3
Generalized linear mixed model fit by the Laplace approximation 
Formula: y2 ~ 1 + (1 | alpha) + (1 | beta) 
   Data: simu 
   AIC   BIC logLik deviance
 958.9 993.2 -475.4    950.9
Random effects:
 Groups   Name        Variance   Std.Dev. 
 alpha    (Intercept) 1.0240e-03 0.0319997
 beta     (Intercept) 4.4498e-05 0.0066707
 Residual             4.3307e-03 0.0658077
Number of obs: 40000, groups: alpha, 100; beta, 20

Fixed effects:
            Estimate Std. Error t value
(Intercept) 1.066641   0.003533   301.9
Linear mixed model fit by REML 
Formula: log(y2) ~ 1 + (1 | alpha) + (1 | beta) 
   Data: simu 
     AIC     BIC logLik deviance REMLdev
 -253106 -253072 126557  -253118 -253114
Random effects:
 Groups   Name        Variance   Std.Dev.
 alpha    (Intercept) 0.23916709 0.489047
 beta     (Intercept) 0.01041841 0.102071
 Residual             0.00010044 0.010022
Number of obs: 40000, groups: alpha, 100; beta, 20

Fixed effects:
            Estimate Std. Error t value
(Intercept)  1.06616    0.05396   19.76
Loading required package: minqa
Loading required package: Rcpp
Loading required package: MatrixModels

Attaching package: 'lme4a'

The following object(s) are masked from 'package:stats':

    AIC, BIC
Linear mixed model fit by REML ['merMod']
Formula: y0 ~ 1 + (1 | alpha) + (1 | beta) 
   Data: simu 
REML criterion at convergence: -253114 

Random effects:
 Groups   Name        Variance  Std.Dev.
 alpha    (Intercept) 0.2391445 0.48902 
 beta     (Intercept) 0.0104174 0.10207 
 Residual             0.0001004 0.01002 
Number of obs: 40000, groups: alpha, 100; beta, 20

Fixed effects:
            Estimate Std. Error t value
(Intercept)  1.06616    0.05397   19.76
Generalized linear mixed model fit by maximum likelihood ['merMod']
 Family: gaussian 
Formula: y1 ~ 1 + (1 | alpha) + (1 | beta) 
   Data: simu 
      AIC       BIC    logLik  deviance 
 906.5606  932.3505 -450.2803  900.5606 

Random effects:
 Groups Name        Variance Std.Dev.
 alpha  (Intercept) 0.23625  0.4861  
 beta   (Intercept) 0.01028  0.1014  
Number of obs: 40000, groups: alpha, 100; beta, 20

Fixed effects:
            Estimate Std. Error z value
(Intercept)  1.06688    0.05368   19.88
Generalized linear mixed model fit by maximum likelihood ['merMod']
 Family: gaussian 
Formula: y2 ~ 1 + (1 | alpha) + (1 | beta) 
   Data: simu 
      AIC       BIC    logLik  deviance 
 956.8503  982.6402 -475.4251  950.8503 

Random effects:
 Groups Name        Variance Std.Dev.
 alpha  (Intercept) 0.23629  0.4861  
 beta   (Intercept) 0.01027  0.1014  
Number of obs: 40000, groups: alpha, 100; beta, 20

Fixed effects:
            Estimate Std. Error z value
(Intercept)  1.06698    0.05368   19.88
Linear mixed model fit by REML ['merMod']
Formula: log(y2) ~ 1 + (1 | alpha) + (1 | beta) 
   Data: simu 
REML criterion at convergence: -253114 

Random effects:
 Groups   Name        Variance  Std.Dev.
 alpha    (Intercept) 0.2391436 0.48902 
 beta     (Intercept) 0.0104175 0.10207 
 Residual             0.0001004 0.01002 
Number of obs: 40000, groups: alpha, 100; beta, 20

Fixed effects:
            Estimate Std. Error t value
(Intercept)  1.06616    0.05397   19.76
R version 2.13.0 (2011-04-13)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] MatrixModels_0.2-1 minqa_1.1.15       Rcpp_0.9.4         Matrix_0.999375-50
[5] lattice_0.19-23   

loaded via a namespace (and not attached):
[1] codetools_0.2-8   grid_2.13.0       lme4_0.999375-39  lme4a_0.999375-65
[5] nlme_3.1-100      splines_2.13.0    stats4_2.13.0
#
Dear Reinhold,

Thank you very much for pointing out my typo. Yes, I should use
rnorm(n*m*k,0,sigma.e) instead of rnorm(1,0,sigma.e).

The reason I not included the random error in the exponential function
is that I want to follow the idea of generalized linear model, which
assumes g(E(y))=X\beta+Zb, g is the link function and g=ln() in this
case.

It seems that lme4a is the best one to fit glimmix model while lme4
can not give out a desirable result. Thank all the responders.

Best,

Yue Yu


On Sun, Jun 26, 2011 at 17:12, Reinhold Kliegl
<reinhold.kliegl at gmail.com> wrote: