Hello All,
I have read several posts related to this previously, but haven't found any
resolution yet. When running the same GLMM in glmer and in SAS PROC GLIMMIX,
both programs return comparable parameter estimates, but wildly different
likelihoods and AIC values.
In SAS I specify use of the Laplace approximation. In R, I believe this is
the default (no?).
What's the difference, and [how] can I reproduce the SAS -2ll in glmer?
Thanks,
Jeff
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
I have read several posts related to this previously, but haven't found any
resolution yet. When running the same GLMM in glmer and in SAS PROC GLIMMIX,
both programs return comparable parameter estimates, but wildly different
likelihoods and AIC values.
In SAS I specify use of the Laplace approximation. In R, I believe this is
the default (no?).
What's the difference, and [how] can I reproduce the SAS -2ll in glmer?
The difference is probably due to the way that the deviance is defined
for the binomial family in R. A glm family object is a list of
functions and expressions. One of the functions, called "dev.resids"
has arguments y, mu and weights. You can specify the response for a
binomial family as the 0/1 responses or as a matrix with two columns
as you did here. When you use the two column specification the
response y is transformed to the fraction of successes and the number
of cases is incorporated in the weights. It turns out that this is
all the information necessary for obtaining the mle's of the
parameters but it does not give the same deviance as you would get by
listing the 0/1 responses.
I'll write an example using the cbpp data from the lme4 package.
Thanks,
Jeff
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
Good question.
They are similar
Compare models with nested fixed effects structures
Full model = lnsdlmaxd + lnadultssdld + lnsdlmaxd:lnadultssdld
Reduced model = lnsdlmaxd + lnadultssdld
AIC R SAS
Full 1150 1663.9
Reduced 1159 1673.4
-------------------------------
deltaAIC 9 9.5
________________________________
From: almost at gmail.com [mailto:almost at gmail.com] On Behalf Of Andy Fugard
Sent: Thursday, July 01, 2010 12:16 PM
To: Jeffrey Evans
Cc: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] different aic and LL in glmer(lme4) and
glimmix(SAS)?
Hi Jeff,
Can't answer the question, but out of interest, what happens when you
compare nested models in R and SAS, e.g., models with and without the
lnsdlmaxd:lnadultssdld interaction? Would be interesting to see the
log-likehood ratio (and maybe /change/ in AIC and BIC between the models).
Cheers,
Andy
On Thu, Jul 1, 2010 at 18:03, Jeffrey Evans <Jeffrey.Evans at dartmouth.edu>
wrote:
Hello All,
I have read several posts related to this previously, but haven't
found any
resolution yet. When running the same GLMM in glmer and in SAS PROC
GLIMMIX,
both programs return comparable parameter estimates, but wildly
different
likelihoods and AIC values.
In SAS I specify use of the Laplace approximation. In R, I believe
this is
the default (no?).
What's the difference, and [how] can I reproduce the SAS -2ll in
glmer?
Thanks,
Jeff
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
> R_GLMM = glmer(cbind(SdlFinal, SdlMax-SdlFinal) ~
lnsdlmaxd*lnadultssdld +
(1|ID),data=sdlPCAdat,family="binomial")
> R_GLMM
Generalized linear mixed model fit by the Laplace approximation
Formula: cbind(SdlFinal, SdlMax - SdlFinal) ~ lnsdlmaxd *
lnadultssdld +
(1 | ID)
Data: sdlPCAdat
AIC BIC logLik deviance
1150 1165 -570 1140 <------------------ this line!!
Random effects:
Groups Name Variance Std.Dev.
ID (Intercept) 1.2491 1.1176
Number of obs: 144, groups: ID, 48
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 4.56964 0.43148 10.591 < 2e-16 ***
lnsdlmaxd -0.65936 0.05686 -11.595 < 2e-16 ***
lnadultssdld -0.64534 0.15861 -4.069 4.73e-05 ***
lnsdlmaxd:lnadultssdld 0.07393 0.02166 3.414 0.00064 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr) lnsdlm lndlts
lnsdlmaxd -0.923
lnadltssdld -0.461 0.479
lnsdlmxd:ln 0.482 -0.508 -0.994
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
title 'SAS GLMM';
proc glimmix data=sdlPCAdat ic=pq noitprint method=laplace;
class site id;
model sdlfinal/sdlmax = lnsdlmaxd|lnadultssdld/ solution
dist=binomial;
random ID /;
covtest glm/wald;
run;
//////////////////////////////////////////////////////////////////////
SAS GLMM 19:36 Wednesday, June 30, 2010 88
The GLIMMIX Procedure
Data Set WORK.SDLPCADAT
Response Variable (Events) SdlFinal
Response Variable (Trials) SdlMax
Response Distribution Binomial
Link Function Logit
Variance Function Default
Variance Matrix Not blocked
Estimation Technique Maximum Likelihood
Likelihood Approximation Laplace
Degrees of Freedom Method Containment
Optimization Information
Optimization Technique Dual Quasi-Newton
Parameters in Optimization 5
Lower Boundaries 1
Upper Boundaries 0
Fixed Effects Not Profiled
Starting From GLM estimates
Convergence criterion (GCONV=1E-8) satisfied.
Fit Statistics
-2 Log Likelihood 1653.90 <------------------
this
line!!
AIC (smaller is better) 1663.90
AICC (smaller is better) 1664.33
BIC (smaller is better) 1673.25
CAIC (smaller is better) 1678.25
HQIC (smaller is better) 1667.43
Fit Statistics for Conditional Distribution
-2 log L(SdlFinal | r. effects) 1436.44
Pearson Chi-Square 908.07
Pearson Chi-Square / DF 6.31
Covariance Parameter Estimates
Cov Standard Z
Parm Estimate Error Value Pr > Z
ID 1.2491 0.2746 4.55 <.0001
Solutions for Fixed Effects
Standard
Effect Estimate Error DF t Value Pr > |t|
Intercept 4.5696 0.4333 47 10.55 <.0001
lnsdlmaxd -0.6594 0.05717 93 -11.53 <.0001
lnadultssdld -0.6453 0.1593 93 -4.05 0.0001
lnsdlmaxd*lnadultssd0.07394 0.02174 93 3.40 0.0010
_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
I see. Thank you for the clarification.
I did just try lme4 on a[n expanded] binary version of the same data, but
the numbers are still not coming out the same as in SAS.
No matter. The deltaAIC values are the same, so I am content that they are
doing similar things.
Cheers,
Jeff
-----Original Message-----
From: dmbates at gmail.com [mailto:dmbates at gmail.com] On Behalf Of Douglas
Bates
Sent: Thursday, July 01, 2010 12:24 PM
To: Jeffrey Evans
Cc: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] different aic and LL in glmer(lme4) and
glimmix(SAS)?
On Thu, Jul 1, 2010 at 11:03 AM, Jeffrey Evans <Jeffrey.Evans at dartmouth.edu>
wrote:
Hello All,
I have read several posts related to this previously, but haven't
found any resolution yet. When running the same GLMM in glmer and in
SAS PROC GLIMMIX, both programs return comparable parameter estimates,
but wildly different likelihoods and AIC values.
In SAS I specify use of the Laplace approximation. In R, I believe
this is the default (no?).
What's the difference, and [how] can I reproduce the SAS -2ll in glmer?
The difference is probably due to the way that the deviance is defined for
the binomial family in R. A glm family object is a list of functions and
expressions. One of the functions, called "dev.resids"
has arguments y, mu and weights. You can specify the response for a
binomial family as the 0/1 responses or as a matrix with two columns as you
did here. When you use the two column specification the response y is
transformed to the fraction of successes and the number of cases is
incorporated in the weights. It turns out that this is all the information
necessary for obtaining the mle's of the parameters but it does not give the
same deviance as you would get by listing the 0/1 responses.
I'll write an example using the cbpp data from the lme4 package.
Thanks,
Jeff
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
Also...R is giving you a model with LESS deviance. So, it's doing a better
job...why would you want to reproduce SAS? :)
--Adam
On Thu, 1 Jul 2010, Jeffrey Evans wrote:
Good question.
They are similar
Compare models with nested fixed effects structures
Full model = lnsdlmaxd + lnadultssdld + lnsdlmaxd:lnadultssdld
Reduced model = lnsdlmaxd + lnadultssdld
AIC R SAS
Full 1150 1663.9
Reduced 1159 1673.4
-------------------------------
deltaAIC 9 9.5
________________________________
From: almost at gmail.com [mailto:almost at gmail.com] On Behalf Of Andy Fugard
Sent: Thursday, July 01, 2010 12:16 PM
To: Jeffrey Evans
Cc: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] different aic and LL in glmer(lme4) and
glimmix(SAS)?
Hi Jeff,
Can't answer the question, but out of interest, what happens when you
compare nested models in R and SAS, e.g., models with and without the
lnsdlmaxd:lnadultssdld interaction? Would be interesting to see the
log-likehood ratio (and maybe /change/ in AIC and BIC between the models).
Cheers,
Andy
On Thu, Jul 1, 2010 at 18:03, Jeffrey Evans <Jeffrey.Evans at dartmouth.edu>
wrote:
Hello All,
I have read several posts related to this previously, but haven't
found any
resolution yet. When running the same GLMM in glmer and in SAS PROC
GLIMMIX,
both programs return comparable parameter estimates, but wildly
different
likelihoods and AIC values.
In SAS I specify use of the Laplace approximation. In R, I believe
this is
the default (no?).
What's the difference, and [how] can I reproduce the SAS -2ll in
glmer?
Thanks,
Jeff
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
> R_GLMM = glmer(cbind(SdlFinal, SdlMax-SdlFinal) ~
lnsdlmaxd*lnadultssdld +
(1|ID),data=sdlPCAdat,family="binomial")
> R_GLMM
Generalized linear mixed model fit by the Laplace approximation
Formula: cbind(SdlFinal, SdlMax - SdlFinal) ~ lnsdlmaxd *
lnadultssdld +
(1 | ID)
Data: sdlPCAdat
AIC BIC logLik deviance
1150 1165 -570 1140 <------------------ this line!!
Random effects:
Groups Name Variance Std.Dev.
ID (Intercept) 1.2491 1.1176
Number of obs: 144, groups: ID, 48
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 4.56964 0.43148 10.591 < 2e-16 ***
lnsdlmaxd -0.65936 0.05686 -11.595 < 2e-16 ***
lnadultssdld -0.64534 0.15861 -4.069 4.73e-05 ***
lnsdlmaxd:lnadultssdld 0.07393 0.02166 3.414 0.00064 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr) lnsdlm lndlts
lnsdlmaxd -0.923
lnadltssdld -0.461 0.479
lnsdlmxd:ln 0.482 -0.508 -0.994
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
title 'SAS GLMM';
proc glimmix data=sdlPCAdat ic=pq noitprint method=laplace;
class site id;
model sdlfinal/sdlmax = lnsdlmaxd|lnadultssdld/ solution
dist=binomial;
random ID /;
covtest glm/wald;
run;
//////////////////////////////////////////////////////////////////////
SAS GLMM 19:36 Wednesday, June 30, 2010 88
The GLIMMIX Procedure
Data Set WORK.SDLPCADAT
Response Variable (Events) SdlFinal
Response Variable (Trials) SdlMax
Response Distribution Binomial
Link Function Logit
Variance Function Default
Variance Matrix Not blocked
Estimation Technique Maximum Likelihood
Likelihood Approximation Laplace
Degrees of Freedom Method Containment
Optimization Information
Optimization Technique Dual Quasi-Newton
Parameters in Optimization 5
Lower Boundaries 1
Upper Boundaries 0
Fixed Effects Not Profiled
Starting From GLM estimates
Convergence criterion (GCONV=1E-8) satisfied.
Fit Statistics
-2 Log Likelihood 1653.90 <------------------
this
line!!
AIC (smaller is better) 1663.90
AICC (smaller is better) 1664.33
BIC (smaller is better) 1673.25
CAIC (smaller is better) 1678.25
HQIC (smaller is better) 1667.43
Fit Statistics for Conditional Distribution
-2 log L(SdlFinal | r. effects) 1436.44
Pearson Chi-Square 908.07
Pearson Chi-Square / DF 6.31
Covariance Parameter Estimates
Cov Standard Z
Parm Estimate Error Value Pr > Z
ID 1.2491 0.2746 4.55 <.0001
Solutions for Fixed Effects
Standard
Effect Estimate Error DF t Value Pr > |t|
Intercept 4.5696 0.4333 47 10.55 <.0001
lnsdlmaxd -0.6594 0.05717 93 -11.53 <.0001
lnadultssdld -0.6453 0.1593 93 -4.05 0.0001
lnsdlmaxd*lnadultssd0.07394 0.02174 93 3.40 0.0010
_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
On Thu, Jul 1, 2010 at 11:24 AM, Douglas Bates <bates at stat.wisc.edu> wrote:
On Thu, Jul 1, 2010 at 11:03 AM, Jeffrey Evans
<Jeffrey.Evans at dartmouth.edu> wrote:
Hello All,
I have read several posts related to this previously, but haven't found any
resolution yet. When running the same GLMM in glmer and in SAS PROC GLIMMIX,
both programs return comparable parameter estimates, but wildly different
likelihoods and AIC values.
In SAS I specify use of the Laplace approximation. In R, I believe this is
the default (no?).
What's the difference, and [how] can I reproduce the SAS -2ll in glmer?
The difference is probably due to the way that the deviance is defined
for the binomial family in R. ?A glm family object is a list of
functions and expressions. ?One of the functions, called "dev.resids"
has arguments y, mu and weights. ?You can specify the response for a
binomial family as the 0/1 responses or as a matrix with two columns
as you did here. ?When you use the two column specification the
response y is transformed to the fraction of successes and the number
of cases is incorporated in the weights. ?It turns out that this is
all the information necessary for obtaining the mle's of the
parameters but it does not give the same deviance as you would get by
listing the 0/1 responses.
I'll write an example using the cbpp data from the lme4 package.
I enclose the example I promised.
Thanks,
Jeff
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
-------------- next part --------------
R version 2.11.1 (2010-05-31)
Copyright (C) 2010 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
library(lme4a)
Loading required package: Matrix
Loading required package: lattice
Attaching package: 'Matrix'
The following object(s) are masked from 'package:base':
det
Loading required package: minqa
Loading required package: Rcpp
Attaching package: 'lme4a'
The following object(s) are masked from 'package:stats':
AIC
##' <description>
##' Expand the data frame for the matrix form of the binomial response
##' to the vector form of the response.
##' <details>
##' In the formula for the glm and glmer functions with family =
##' binomial the response can be specified as a matrix with two
##' columns, corresponding to numbers of successes and failures. The
##' parameter estimates for the model will be the same as those from
##' the vector response but the deviance itself is a different value.
##'
##' This function expands the model frame for the matrix response to
##' the equivalent frame for the vector response, replacing the
##' response matrix with a vector called y. of 1's and 0's
##' @title Expand binomial matrix response frame
##' @param fr model frame from a glm or glmer fitted model with
##' family=binomial and the matrix response specification.
##' @return expanded model frame with the response matrix replaced by
##' a vector of 1's and 0's.
##' @author Douglas Bates
expandsu <- function(fr) {
The difference is probably due to the way that the deviance is defined
for the binomial family in R. ?A glm family object is a list of
functions and expressions. ?One of the functions, called "dev.resids"
has arguments y, mu and weights. ?You can specify the response for a
binomial family as the 0/1 responses or as a matrix with two columns
as you did here. ?When you use the two column specification the
response y is transformed to the fraction of successes and the number
of cases is incorporated in the weights. ?It turns out that this is
all the information necessary for obtaining the mle's of the
parameters but it does not give the same deviance as you would get by
listing the 0/1 responses.
Isn't it also possible the difference is because (e.g.) lme4 drops
constants out of the likelihood and SAS doesn't?
Hadley
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
On Thu, Jul 1, 2010 at 2:54 PM, Hadley Wickham <hadley at rice.edu> wrote:
The difference is probably due to the way that the deviance is defined
for the binomial family in R. ?A glm family object is a list of
functions and expressions. ?One of the functions, called "dev.resids"
has arguments y, mu and weights. ?You can specify the response for a
binomial family as the 0/1 responses or as a matrix with two columns
as you did here. ?When you use the two column specification the
response y is transformed to the fraction of successes and the number
of cases is incorporated in the weights. ?It turns out that this is
all the information necessary for obtaining the mle's of the
parameters but it does not give the same deviance as you would get by
listing the 0/1 responses.
Isn't it also possible the difference is because (e.g.) lme4 drops
constants out of the likelihood and SAS doesn't?
Certainly a possibility. I sometimes amaze myself with how sloppy I
can be in derivations. :-)