Skip to content

different aic and LL in glmer(lme4) and glimmix(SAS)?

10 messages · Jeffrey Evans, Andy Fugard, Adam D. I. Kramer +2 more

#
Hello All,
 
I have read several posts related to this previously, but haven't found any
resolution yet. When running the same GLMM in glmer and in SAS PROC GLIMMIX,
both programs return comparable parameter estimates, but wildly different
likelihoods and AIC values.

In SAS I specify use of the Laplace approximation. In R, I believe this is
the default (no?).

What's the difference, and [how] can I reproduce the SAS -2ll in glmer?

Thanks,
Jeff
 
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
(1|ID),data=sdlPCAdat,family="binomial")
Generalized linear mixed model fit by the Laplace approximation 
Formula: cbind(SdlFinal, SdlMax - SdlFinal) ~ lnsdlmaxd * lnadultssdld +
(1 | ID) 
   Data: sdlPCAdat 
  AIC  BIC logLik deviance
 1150 1165   -570     1140        <------------------ this line!!
Random effects:
 Groups Name        Variance Std.Dev.
 ID     (Intercept) 1.2491   1.1176  
Number of obs: 144, groups: ID, 48
 
Fixed effects:
                       Estimate Std. Error z value Pr(>|z|)    
(Intercept)             4.56964    0.43148  10.591  < 2e-16 ***
lnsdlmaxd              -0.65936    0.05686 -11.595  < 2e-16 ***
lnadultssdld           -0.64534    0.15861  -4.069 4.73e-05 ***
lnsdlmaxd:lnadultssdld  0.07393    0.02166   3.414  0.00064 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
 
Correlation of Fixed Effects:
            (Intr) lnsdlm lndlts
lnsdlmaxd   -0.923              
lnadltssdld -0.461  0.479       
lnsdlmxd:ln  0.482 -0.508 -0.994

 
 \/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
title 'SAS GLMM';
proc glimmix data=sdlPCAdat ic=pq noitprint method=laplace;
class site id;
model sdlfinal/sdlmax = lnsdlmaxd|lnadultssdld/ solution dist=binomial;
random ID /;
covtest glm/wald;
run;

//////////////////////////////////////////////////////////////////////

                      SAS GLMM      19:36 Wednesday, June 30, 2010 88

                   The GLIMMIX Procedure

            Data Set           WORK.SDLPCADAT
            Response Variable (Events)  SdlFinal
            Response Variable (Trials)  SdlMax
            Response Distribution     Binomial
            Link Function         Logit
            Variance Function       Default
            Variance Matrix        Not blocked
            Estimation Technique     Maximum Likelihood
            Likelihood Approximation   Laplace
            Degrees of Freedom Method   Containment



                  Optimization Information

             Optimization Technique    Dual Quasi-Newton
             Parameters in Optimization  5
             Lower Boundaries       1
             Upper Boundaries       0
             Fixed Effects         Not Profiled
             Starting From         GLM estimates

             Convergence criterion (GCONV=1E-8) satisfied.

                     Fit Statistics

               -2 Log Likelihood        1653.90  <------------------ this
line!!
               AIC (smaller is better)  1663.90
               AICC (smaller is better) 1664.33
               BIC (smaller is better)  1673.25
               CAIC (smaller is better) 1678.25
               HQIC (smaller is better) 1667.43


              Fit Statistics for Conditional Distribution

              -2 log L(SdlFinal | r. effects)   1436.44
              Pearson Chi-Square          908.07
              Pearson Chi-Square / DF        6.31


                 Covariance Parameter Estimates

            Cov         Standard     Z
            Parm  Estimate    Error   Value   Pr > Z

            ID    1.2491   0.2746   4.55   <.0001


                 Solutions for Fixed Effects

                              Standard
     Effect         Estimate    Error    DF  t Value  Pr > |t|

     Intercept           4.5696   0.4333    47    10.55  <.0001
     lnsdlmaxd          -0.6594   0.05717   93   -11.53  <.0001
     lnadultssdld       -0.6453   0.1593    93   -4.05   0.0001
     lnsdlmaxd*lnadultssd0.07394  0.02174   93    3.40   0.0010
#
On Thu, Jul 1, 2010 at 11:03 AM, Jeffrey Evans
<Jeffrey.Evans at dartmouth.edu> wrote:
The difference is probably due to the way that the deviance is defined
for the binomial family in R.  A glm family object is a list of
functions and expressions.  One of the functions, called "dev.resids"
has arguments y, mu and weights.  You can specify the response for a
binomial family as the 0/1 responses or as a matrix with two columns
as you did here.  When you use the two column specification the
response y is transformed to the fraction of successes and the number
of cases is incorporated in the weights.  It turns out that this is
all the information necessary for obtaining the mle's of the
parameters but it does not give the same deviance as you would get by
listing the 0/1 responses.

I'll write an example using the cbpp data from the lme4 package.
#
Good question.
 
They are similar
 
Compare models with nested fixed effects structures
Full model = lnsdlmaxd + lnadultssdld + lnsdlmaxd:lnadultssdld
Reduced model = lnsdlmaxd + lnadultssdld

AIC		R		SAS
Full		1150		1663.9
Reduced	1159		1673.4
-------------------------------
deltaAIC	9		9.5


________________________________

From: almost at gmail.com [mailto:almost at gmail.com] On Behalf Of Andy Fugard
Sent: Thursday, July 01, 2010 12:16 PM
To: Jeffrey Evans
Cc: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] different aic and LL in glmer(lme4) and
glimmix(SAS)?


Hi Jeff, 

Can't answer the question, but out of interest, what happens when you
compare nested models in R and SAS, e.g., models with and without the
lnsdlmaxd:lnadultssdld interaction?  Would be interesting to see the
log-likehood ratio (and maybe /change/ in AIC and BIC between the models).

Cheers,

Andy


On Thu, Jul 1, 2010 at 18:03, Jeffrey Evans <Jeffrey.Evans at dartmouth.edu>
wrote:


	Hello All,
	
	I have read several posts related to this previously, but haven't
found any
	resolution yet. When running the same GLMM in glmer and in SAS PROC
GLIMMIX,
	both programs return comparable parameter estimates, but wildly
different
	likelihoods and AIC values.
	
	In SAS I specify use of the Laplace approximation. In R, I believe
this is
	the default (no?).
	
	What's the difference, and [how] can I reproduce the SAS -2ll in
glmer?
	
	Thanks,
	Jeff
	
	\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
	> R_GLMM = glmer(cbind(SdlFinal, SdlMax-SdlFinal) ~
lnsdlmaxd*lnadultssdld +
	 (1|ID),data=sdlPCAdat,family="binomial")
	> R_GLMM
	Generalized linear mixed model fit by the Laplace approximation
	Formula: cbind(SdlFinal, SdlMax - SdlFinal) ~ lnsdlmaxd *
lnadultssdld +
	(1 | ID)
	  Data: sdlPCAdat
	 AIC  BIC logLik deviance
	 1150 1165   -570     1140        <------------------ this line!!
	Random effects:
	 Groups Name        Variance Std.Dev.
	 ID     (Intercept) 1.2491   1.1176
	Number of obs: 144, groups: ID, 48
	
	Fixed effects:
	                      Estimate Std. Error z value Pr(>|z|)
	(Intercept)             4.56964    0.43148  10.591  < 2e-16 ***
	lnsdlmaxd              -0.65936    0.05686 -11.595  < 2e-16 ***
	lnadultssdld           -0.64534    0.15861  -4.069 4.73e-05 ***
	lnsdlmaxd:lnadultssdld  0.07393    0.02166   3.414  0.00064 ***
	---
	Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
	
	Correlation of Fixed Effects:
	           (Intr) lnsdlm lndlts
	lnsdlmaxd   -0.923
	lnadltssdld -0.461  0.479
	lnsdlmxd:ln  0.482 -0.508 -0.994
	
	
	 \/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
	title 'SAS GLMM';
	proc glimmix data=sdlPCAdat ic=pq noitprint method=laplace;
	class site id;
	model sdlfinal/sdlmax = lnsdlmaxd|lnadultssdld/ solution
dist=binomial;
	random ID /;
	covtest glm/wald;
	run;
	
	
//////////////////////////////////////////////////////////////////////
	
	                     SAS GLMM      19:36 Wednesday, June 30, 2010 88
	
	                  The GLIMMIX Procedure
	
	           Data Set           WORK.SDLPCADAT
	           Response Variable (Events)  SdlFinal
	           Response Variable (Trials)  SdlMax
	           Response Distribution     Binomial
	           Link Function         Logit
	           Variance Function       Default
	           Variance Matrix        Not blocked
	           Estimation Technique     Maximum Likelihood
	           Likelihood Approximation   Laplace
	           Degrees of Freedom Method   Containment
	
	
	
	                 Optimization Information
	
	            Optimization Technique    Dual Quasi-Newton
	            Parameters in Optimization  5
	            Lower Boundaries       1
	            Upper Boundaries       0
	            Fixed Effects         Not Profiled
	            Starting From         GLM estimates
	
	            Convergence criterion (GCONV=1E-8) satisfied.
	
	                    Fit Statistics
	
	              -2 Log Likelihood        1653.90  <------------------
this
	line!!
	              AIC (smaller is better)  1663.90
	              AICC (smaller is better) 1664.33
	              BIC (smaller is better)  1673.25
	              CAIC (smaller is better) 1678.25
	              HQIC (smaller is better) 1667.43
	
	
	             Fit Statistics for Conditional Distribution
	
	             -2 log L(SdlFinal | r. effects)   1436.44
	             Pearson Chi-Square          908.07
	             Pearson Chi-Square / DF        6.31
	
	
	                Covariance Parameter Estimates
	
	           Cov         Standard     Z
	           Parm  Estimate    Error   Value   Pr > Z
	
	           ID    1.2491   0.2746   4.55   <.0001
	
	
	                Solutions for Fixed Effects
	
	                             Standard
	    Effect         Estimate    Error    DF  t Value  Pr > |t|
	
	    Intercept           4.5696   0.4333    47    10.55  <.0001
	    lnsdlmaxd          -0.6594   0.05717   93   -11.53  <.0001
	    lnadultssdld       -0.6453   0.1593    93   -4.05   0.0001
	    lnsdlmaxd*lnadultssd0.07394  0.02174   93    3.40   0.0010
	
	_______________________________________________
	R-sig-mixed-models at r-project.org mailing list
	https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
#
I see. Thank you for the clarification.  


I did just try lme4 on a[n expanded] binary version of the same data, but
the numbers are still not coming out the same as in SAS.

No matter. The deltaAIC values are the same, so I am content that they are
doing similar things.

Cheers,
Jeff


-----Original Message-----
From: dmbates at gmail.com [mailto:dmbates at gmail.com] On Behalf Of Douglas
Bates
Sent: Thursday, July 01, 2010 12:24 PM
To: Jeffrey Evans
Cc: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] different aic and LL in glmer(lme4) and
glimmix(SAS)?

On Thu, Jul 1, 2010 at 11:03 AM, Jeffrey Evans <Jeffrey.Evans at dartmouth.edu>
wrote:
The difference is probably due to the way that the deviance is defined for
the binomial family in R.  A glm family object is a list of functions and
expressions.  One of the functions, called "dev.resids"
has arguments y, mu and weights.  You can specify the response for a
binomial family as the 0/1 responses or as a matrix with two columns as you
did here.  When you use the two column specification the response y is
transformed to the fraction of successes and the number of cases is
incorporated in the weights.  It turns out that this is all the information
necessary for obtaining the mle's of the parameters but it does not give the
same deviance as you would get by listing the 0/1 responses.

I'll write an example using the cbpp data from the lme4 package.
#
Also...R is giving you a model with LESS deviance. So, it's doing a better
job...why would you want to reproduce SAS? :)

--Adam
On Thu, 1 Jul 2010, Jeffrey Evans wrote:

            
#
On Thu, Jul 1, 2010 at 11:24 AM, Douglas Bates <bates at stat.wisc.edu> wrote:
I enclose the example I promised.
-------------- next part --------------

R version 2.11.1 (2010-05-31)
Copyright (C) 2010 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
Loading required package: Matrix
Loading required package: lattice

Attaching package: 'Matrix'

The following object(s) are masked from 'package:base':

    det

Loading required package: minqa
Loading required package: Rcpp

Attaching package: 'lme4a'

The following object(s) are masked from 'package:stats':

    AIC
+     stopifnot(is(fr, "data.frame"),
+               is(mm <- fr[[1]], "matrix"),
+               ncol(mm) == 2,
+               is.numeric(mm),
+               all(mm >= 0))
+     nr <- nrow(mm)
+     within(fr[, -1][rep.int(seq_len(nr), rowSums(mm)), ],
+            y. <- rep.int(rep.int(c(1,0), nr), as.vector(t(mm))))
+ }
'data.frame':	56 obs. of  4 variables:
 $ herd     : Factor w/ 15 levels "1","2","3","4",..: 1 1 1 1 2 2 2 3 3 3 ...
 $ incidence: num  2 3 4 0 3 1 1 8 2 0 ...
 $ size     : num  14 12 9 5 22 18 21 22 16 16 ...
 $ period   : Factor w/ 4 levels "1","2","3","4": 1 2 3 4 1 2 3 1 2 3 ...
+               (1|herd), cbpp, binomial, verbose=1L))
npt = 3 , n =  1 
rhobeg =  0.2 , rhoend =  2e-07 
   0.020:   4:      100.209;0.600000 
  0.0020:   7:      100.154;0.649390 
 0.00020:  10:      100.152;0.641932 
 2.0e-05:  12:      100.152;0.641823 
 2.0e-06:  13:      100.152;0.641823 
 2.0e-07:  15:      100.152;0.641815 
At return
 18:     100.15189: 0.641815
npt = 11 , n =  5 
rhobeg =  0.5840305 , rhoend =  5.840305e-07 
   0.058:  11:      100.152;0.641815 -1.36047 -2.33665 -2.47155 -2.92015 
  0.0058:  18:      100.108;0.653041 -1.38121 -2.38874 -2.52675 -2.97967 
 0.00058:  31:      100.095;0.642982 -1.39633 -2.38831 -2.52555 -2.97656 
 5.8e-05:  48:      100.095;0.642327 -1.39893 -2.39134 -2.52748 -2.97945 
 5.8e-06:  57:      100.095;0.642392 -1.39894 -2.39126 -2.52758 -2.97924 
 5.8e-07:  66:      100.095;0.642393 -1.39894 -2.39125 -2.52758 -2.97924 
At return
 78:     100.09497: 0.642392 -1.39894 -2.39125 -2.52758 -2.97924
Generalized linear mixed model fit by maximum likelihood ['merMod']
 Family: binomial 
Formula: cbind(incidence, size - incidence) ~ 0 + period + (1 | herd) 
   Data: cbpp 
     AIC      BIC   logLik deviance 
110.0950 120.2217 -50.0475 100.0950 

Random effects:
 Groups Name        Variance Std.Dev.
 herd   (Intercept) 0.4127   0.6424  
Number of obs: 56, groups: herd, 15

Fixed effects:
        Estimate Std. Error z value
period1  -1.3989     0.2279  -6.138
period2  -2.3912     0.3103  -7.705
period3  -2.5276     0.3308  -7.641
period4  -2.9792     0.4327  -6.885

Correlation of Fixed Effects:
        perid1 perid2 perid3
period2 0.389               
period3 0.365  0.289        
period4 0.280  0.220  0.205
'data.frame':	842 obs. of  3 variables:
 $ period: Factor w/ 4 levels "1","2","3","4": 1 1 1 1 1 1 1 1 1 1 ...
 $ herd  : Factor w/ 15 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ y.    : num  1 1 0 0 0 0 0 0 0 0 ...
+                verbose=1L))
npt = 3 , n =  1 
rhobeg =  0.2 , rhoend =  2e-07 
   0.020:   4:      555.118;0.600000 
  0.0020:   7:      555.062;0.649390 
 0.00020:  10:      555.060;0.641932 
 2.0e-05:  12:      555.060;0.641823 
 2.0e-06:  13:      555.060;0.641823 
 2.0e-07:  15:      555.060;0.641815 
At return
 17:     555.05991: 0.641815
npt = 11 , n =  5 
rhobeg =  0.5840305 , rhoend =  5.840305e-07 
   0.058:  11:      555.060;0.641815 -1.36047 -2.33665 -2.47155 -2.92015 
  0.0058:  18:      555.016;0.653041 -1.38121 -2.38874 -2.52675 -2.97967 
 0.00058:  31:      555.003;0.642982 -1.39633 -2.38831 -2.52555 -2.97656 
 5.8e-05:  48:      555.003;0.642327 -1.39893 -2.39134 -2.52748 -2.97945 
 5.8e-06:  57:      555.003;0.642392 -1.39894 -2.39126 -2.52758 -2.97924 
 5.8e-07:  66:      555.003;0.642393 -1.39894 -2.39125 -2.52758 -2.97924 
At return
 76:     555.00300: 0.642392 -1.39894 -2.39125 -2.52758 -2.97924
Generalized linear mixed model fit by maximum likelihood ['merMod']
 Family: binomial 
Formula: y. ~ 0 + period + (1 | herd) 
   Data: cbpp1 
      AIC       BIC    logLik  deviance 
 565.0030  588.6819 -277.5015  555.0030 

Random effects:
 Groups Name        Variance Std.Dev.
 herd   (Intercept) 0.4127   0.6424  
Number of obs: 842, groups: herd, 15

Fixed effects:
        Estimate Std. Error z value
period1  -1.3989     0.2279  -6.138
period2  -2.3912     0.3103  -7.705
period3  -2.5276     0.3308  -7.641
period4  -2.9792     0.4327  -6.885

Correlation of Fixed Effects:
        perid1 perid2 perid3
period2 0.389               
period3 0.365  0.289        
period4 0.280  0.220  0.205
user  system elapsed 
  6.140   0.130   6.405
#
Isn't it also possible the difference is because (e.g.) lme4 drops
constants out of the likelihood and SAS doesn't?

Hadley
#
On Thu, Jul 1, 2010 at 2:54 PM, Hadley Wickham <hadley at rice.edu> wrote:
Certainly a possibility.  I sometimes amaze myself with how sloppy I
can be in derivations. :-)