lme4 vs. HLM (program) - differences in estimation?
On Tue, Nov 18, 2008 at 11:02 AM, Felix Schönbrodt <nicebread at gmx.net> wrote:
Hi all,
in my workgroup the HLM program by Raudenbush et al. is the de facto
standard - however, I'd like to do my mixed models in R using the lme4
package (as I do all other computations in R as well...)
Both as practice and as an argument for my colleagues, I tried to replicate
(among others) the ubiquitous "math-achievement-in-catholic-vs.-public
schools" example (using the original HLM data set) in lme4.
My first question is: is my formula in lmer the right one?
--> HLM-style model ( Y = MATHACH; ID = grouping factor)
Level-1 Model
Y = B0 + B1*(SES) + R
Level-2 Model
B0 = G00 + G01*(SECTOR) + G02*(MEANSES) + U0
B1 = G10 + G11*(SECTOR) + G12*(MEANSES) + U1
Combined Model:
Y = G00 + G01*(SECTOR) + G02*(MEANSES) + G10*(SES) + G11*(SECTOR)*(SES) +
G12*(MEANSES)*(SES) + U0 + U1*(SES) + R
--> lmer-style:
HSB.ML <- lmer(mathach ~ sector + meanses + ses.gmc + meanses*ses.gmc +
               sector*ses.gmc + (ses.gmc | id), data=HSB)
That formula is not incorrect, but it is a bit redundant. In the R formula notation the ':' operator creates an interaction and the '*' operator crosses effects, so meanses*ses.gmc expands to meanses + ses.gmc + meanses:ses.gmc, and the separately listed main effects are generated a second time.
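For example (a minimal sketch, assuming the HSB data frame and the group-mean-centered predictor ses.gmc from the output below), these two calls specify the same model:

library(lme4)

## Redundant form: the main effects are listed explicitly and then
## generated again by the '*' terms.
m1 <- lmer(mathach ~ meanses + sector + ses.gmc + meanses*ses.gmc +
           sector*ses.gmc + (ses.gmc | id), data = HSB)

## Equivalent, non-redundant form: '*' expands to the main effects
## plus the interaction, so nothing needs to be listed twice.
m2 <- lmer(mathach ~ (meanses + sector) * ses.gmc + (ses.gmc | id),
           data = HSB)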
All models yield the same results for the fixed effects; models including random slopes, however, show some minor divergence in the random-effects variances (not very much, but it is there ...; see the sample outputs below). In the presented example, the slope variance component is 0.10 (lme4) vs. 0.15 (HLM). I am aware that these differences are really small - in this case the difference is only about 0.1% of the residual variance.
Does the HLM specification allow for correlation of the random effects within each ID group? The lmer specification does.
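If the other program constrains that covariance to zero, the variance estimates will not match exactly. A minimal sketch of fitting both structures in lmer (assuming the same HSB data):

library(lme4)

## Random intercept and slope allowed to correlate (the default above):
m.corr  <- lmer(mathach ~ (meanses + sector) * ses.gmc + (ses.gmc | id),
                data = HSB)

## Random intercept and slope forced to be independent, for comparison
## with a fit that constrains the covariance to zero:
m.indep <- lmer(mathach ~ (meanses + sector) * ses.gmc +
                (1 | id) + (0 + ses.gmc | id), data = HSB)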
Nonetheless, I would really be happy if someone could shed some light on this issue, so that I can go to my colleagues and say: "We get slightly different results _because_ [...], BUT this is not a problem _because_ ..." From my web and literature searches I have two guesses about why the two programs differ:

- a statement by Douglas Bates that lme4 uses a different estimation algorithm: "[...] but may be of interest to some who are familiar with a generalized least squares (GLS) representation of mixed-models (such as used in MLWin and HLM). The lme4 package uses a penalized least squares (PLS) representation instead." (http://markmail.org/search/?q=lme4+PLS+GLS#query:lme4%20PLS%20GLS+page:1+mid:hb6p6mk6sztkvii2+state:results)
The criterion, either the log-likelihood for ML estimates or the REML criterion for REML estimates, is defined as a property of the data and the model. The issue I was discussing there is how to evaluate the log-likelihood or the REML criterion given the data, the model and values of the parameters. Solving a penalized least squares problem or a generalized least squares problem is just a step in that evaluation; both paths should give the same answer. The reasons I prefer the penalized least squares approach have to do with accuracy, reliability and speed, as well as the generality of the approach.

I think it is more important to check that the model specifications are the same and that you are using the same criterion. The default criterion for lmer is REML; I don't know what the default criterion for HLM is.
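A minimal sketch of checking the criterion on the lmer side (assuming the model from above; consult the HLM documentation for its default):

library(lme4)

## REML fit: lmer's default criterion
m.reml <- lmer(mathach ~ (meanses + sector) * ses.gmc + (ses.gmc | id),
               data = HSB)

## ML fit, in case the other program maximizes the log-likelihood instead
m.ml <- update(m.reml, REML = FALSE)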
- HLM maybe does some Bayesian estimation, which lme4 does not do (?)
I'm not sure what that would mean. To a statistician, "Bayesian estimation" is associated with Bayesian representations of models in which the parameters are regarded as random variables. The estimation criterion then changes from the likelihood (or a related criterion like the REML criterion) to maximizing the "posterior density" of the parameters. Because the posterior density depends on both the likelihood and the prior density, you must specify both the probability model and the prior densities to be able to define the estimates.

At least that is the statistician's view of Bayesian estimation. In some fields, like machine learning, the adjective "Bayesian" is applied to any algorithm that seems remotely related to a probability model. For example, movie ratings at imdb.com are calculated with what they term a Bayesian algorithm, which in their case means a weighted average of the actual votes for the movie and a typical vote, so that a movie rated highly by very few people doesn't suddenly become their number 1 rated movie of all time. Just saying it is Bayesian doesn't define the answer; you need to specify the probability model.

I would check two things: does HLM estimate two variances and a covariance for the random effects, and is it using the REML criterion or the ML criterion?
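A minimal sketch of both checks on the lmer side, using the fitted model HSB.ML from above:

## Does the fit estimate two variances and a covariance (shown as a
## correlation) for the random effects?
VarCorr(HSB.ML)

## Which criterion was used? The first line of the summary reports
## "fit by REML" or "fit by maximum likelihood".
summary(HSB.ML)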
But these are only guesses on my part ... I'm not a statistician, but I
would like to understand it (a bit).
All best,
Felix
#-----------------------------------
# OUTPUT FROM HLM
#-----------------------------------
Final estimation of fixed effects:
----------------------------------------------------------------------------
                                     Standard              Approx.
    Fixed Effect        Coefficient  Error      T-ratio    d.f.     P-value
----------------------------------------------------------------------------
 For INTRCPT1, B0
    INTRCPT2, G00        12.096006   0.198734    60.865     157      0.000
    SECTOR, G01           1.226384   0.306271     4.004     157      0.000
    MEANSES, G02          5.333056   0.369161    14.446     157      0.000
 For SES slope, B1
    INTRCPT2, G10         2.937980   0.157140    18.697     157      0.000
    SECTOR, G11          -1.640951   0.242912    -6.755     157      0.000
    MEANSES, G12          1.034418   0.302574     3.419     157      0.001
----------------------------------------------------------------------------

Final estimation of variance components:
-----------------------------------------------------------------------------
 Random Effect          Standard     Variance     df    Chi-square   P-value
                        Deviation    Component
-----------------------------------------------------------------------------
 INTRCPT1,      U0       1.54271      2.37996    157    605.29563     0.000
 SES slope,     U1       0.38603      0.14902    157    162.30883     0.369
 level-1,       R        6.05831     36.70309
-----------------------------------------------------------------------------
#-----------------------------------
# OUTPUT FROM LMER
#-----------------------------------
Linear mixed model fit by REML
Formula: mathach ~ meanses + sector + ses.gmc + meanses * ses.gmc + sector *
    ses.gmc + (ses.gmc | id)
   Data: HSB
   AIC   BIC logLik deviance REMLdev
 46524 46592 -23252    46496   46504
Random effects:
 Groups   Name        Variance Std.Dev. Corr
 id       (Intercept)  2.379   1.543
          ses.gmc      0.101   0.318    0.391
 Residual             36.721   6.060
Number of obs: 7185, groups: id, 160

Fixed effects:
                Estimate Std. Error t value
(Intercept)     12.09600    0.19873  60.866
meanses          5.33291    0.36916  14.446
sector           1.22645    0.30627   4.005
ses.gmc          2.93877    0.15509  18.949
meanses:ses.gmc  1.03890    0.29889   3.476
sector:ses.gmc  -1.64261    0.23978  -6.850
___________________________________
Dipl. Psych. Felix Schönbrodt
Department of Psychology
Ludwig-Maximilians-Universität (LMU) Munich
Leopoldstr. 13
D-80802 München