Felix
I think I can help a bit. I need to know what ses.gmc is. I have
those data and have run the following model:
lmer(mathach ~ meanses * ses + sector * ses + (ses | id), hsb)
My output below shows that the data (yours and mine) have the same number
of obs., but the original data distributed with HLM do not have a
variable called ses.gmc. The only variables in the data are:
'data.frame':   7185 obs. of  11 variables:
 $ id      : Factor w/ 160 levels "1224","1288",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ minority: num  0 0 0 0 0 0 0 0 0 0 ...
 $ female  : num  1 1 0 0 0 0 1 0 1 0 ...
 $ ses     : num  -1.528 -0.588 -0.528 -0.668 -0.158 ...
 $ mathach : num  5.88 19.71 20.35 8.78 17.90 ...
 $ size    : num  842 842 842 842 842 842 842 842 842 842 ...
 $ sector  : num  0 0 0 0 0 0 0 0 0 0 ...
 $ pracad  : num  0.35 0.35 0.35 0.35 0.35 0.35 0.35 0.35 0.35 0.35 ...
 $ disclim : num  1.60 1.60 1.60 1.60 1.60 ...
 $ himinty : num  0 0 0 0 0 0 0 0 0 0 ...
 $ meanses : num  -0.428 -0.428 -0.428 -0.428 -0.428 -0.428 -0.428 -0.428 -0.428 -0.428 ...
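My guess is that ses.gmc is a group-mean-centered version of ses. If so, it could be constructed from the variables that are in the data, along these lines (ses.gmc itself is my guess at what you computed, so please confirm):

```r
## Hypothetical reconstruction: ses.gmc as ses centered at the school mean.
## 'hsb' and its columns are from the original HLM data set.
hsb$ses.gmc <- hsb$ses - ave(hsb$ses, hsb$id)  # subtract each school's mean ses
## meanses in the data is also a school-level mean, so
## hsb$ses - hsb$meanses would be another candidate.
```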
Also, you should report which version of lme4 you are using. Run
sessionInfo() and the version will appear in the output.
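For example:

```r
## Two ways to see which lme4 version you are running:
sessionInfo()                       # lists attached packages with versions
packageDescription("lme4")$Version  # just the lme4 version string
```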
lmer(mathach ~ meanses * ses + sector * ses + (ses | id), hsb)
Linear mixed model fit by REML
Formula: mathach ~ meanses * ses + sector * ses + (ses | id)
Data: hsb
AIC BIC logLik deviance REMLdev
46525 46594 -23253 46498 46505
Random effects:
Groups Name Variance Std.Dev. Corr
id (Intercept) 2.40744 1.55159
ses 0.01462 0.12091 1.000
Residual 36.75838 6.06287
Number of obs: 7185, groups: id, 160
Fixed effects:
Estimate Std. Error t value
(Intercept) 12.0948 0.2028 59.65
meanses 3.3202 0.3885 8.55
ses 2.9052 0.1483 19.59
sector 1.1949 0.3079 3.88
meanses:ses 0.8462 0.2718 3.11
ses:sector -1.5781 0.2245 -7.03
Correlation of Fixed Effects:
(Intr) meanss ses sector mnss:s
meanses 0.213
ses 0.076 -0.146
sector -0.676 -0.344 -0.064
meanses:ses -0.143 0.178 0.278 -0.081
ses:sector -0.062 -0.080 -0.679 0.093 -0.356
-----Original Message-----
From: r-sig-mixed-models-bounces at r-project.org
[mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf
Of Felix Schönbrodt
Sent: Tuesday, November 18, 2008 12:02 PM
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] lme4 vs. HLM (program) - differences in
estimation?
Hi all,
in my workgroup, the HLM program by Raudenbush et al. is the
de facto standard; however, I'd like to do my mixed models
in R using the lme4 package (as I do all my other computations
in R as well ...).
Both as practice and as an argument for my colleagues, I
tried to replicate (amongst others) the omnipresent
"math-achievement-in-catholic-vs.-public-schools" example
(using the original HLM data set) in lme4.
My first question is: is my formula in lmer the right one?
--> HLM-style model ( Y = MATHACH; ID = grouping factor)
Level-1 Model
Y = B0 + B1*(SES) + R
Level-2 Model
B0 = G00 + G01*(SECTOR) + G02*(MEANSES) + U0
B1 = G10 + G11*(SECTOR) + G12*(MEANSES) + U1
Combined Model:
Y = G00 + G01*(SECTOR) + G02*(MEANSES) + G11*(SECTOR)*(SES) +
G12*(MEANSES)*(SES) + U0 + U1*(SES) + R
--> lmer-style:
HSB.ML <- lmer(mathach ~ sector + meanses + ses +
meanses*ses + sector*ses + (ses.gmc | id), data=HSB)
All models yield the same results for the fixed effects;
models including random slopes, however, show some minor
divergence in the random-effects variances (not very much, but
it is there ...; see sample outputs below). In the presented
example, the random-slope variance component is 0.10 (lme4) vs. 0.15 (HLM).
I am aware that these differences are really small - in this
case the difference was only 0.1% of the residual variance.
Nonetheless, I would be really happy if someone could shed
some light on this issue, so that I can go to my colleagues
and say: "We get slightly different results _because_ [...],
BUT this is not a problem _because_ ..."
From my web and literature searches I have two guesses about
the differences between the two programs:
- a statement by Douglas Bates that lme4 uses a different estimation
algorithm:
"[...] but may be of interest to some who are familiar with a
generalized least squares (GLS) representation of
mixed-models (such as used in MLWin and HLM). The lme4
package uses a penalized least squares (PLS) representation instead."
(http://markmail.org/search/?q=lme4+PLS+GLS#query:lme4%20PLS%20GLS
+page:1+mid:hb6p6mk6sztkvii2+state:results)
- maybe HLM does some Bayesian estimation that lme4 does not (?)
But these are only guesses on my part ... I'm not a
statistician, but I would like to understand it (a bit).
All best,
Felix
#-----------------------------------
# OUTPUT FROM HLM
#-----------------------------------
Final estimation of fixed effects:
----------------------------------------------------------------------------
                                    Standard              Approx.
 Fixed Effect        Coefficient      Error     T-ratio    d.f.    P-value
----------------------------------------------------------------------------
For INTRCPT1, B0
 INTRCPT2, G00        12.096006     0.198734     60.865     157      0.000
 SECTOR, G01           1.226384     0.306271      4.004     157      0.000
 MEANSES, G02          5.333056     0.369161     14.446     157      0.000
For SES slope, B1
 INTRCPT2, G10         2.937980     0.157140     18.697     157      0.000
 SECTOR, G11          -1.640951     0.242912     -6.755     157      0.000
 MEANSES, G12          1.034418     0.302574      3.419     157      0.001
----------------------------------------------------------------------------
Final estimation of variance components:
-----------------------------------------------------------------------------
 Random Effect        Standard      Variance      df    Chi-square    P-value
                      Deviation     Component
-----------------------------------------------------------------------------
 INTRCPT1,  U0         1.54271       2.37996     157     605.29563     0.000
 SES slope, U1         0.38603       0.14902     157     162.30883     0.369
 level-1,   R          6.05831      36.70309
-----------------------------------------------------------------------------
#-----------------------------------
# OUTPUT FROM LMER
#-----------------------------------
Linear mixed model fit by REML
Formula: mathach ~ meanses + sector + ses.gmc + meanses * ses.gmc +
sector * ses.gmc + (ses.gmc | id)
Data: HSB
AIC BIC logLik deviance REMLdev
46524 46592 -23252 46496 46504
Random effects:
Groups Name Variance Std.Dev. Corr
id (Intercept) 2.379 1.543
ses.gmc 0.101 0.318 0.391
Residual 36.721 6.060
Number of obs: 7185, groups: id, 160
Fixed effects:
Estimate Std. Error t value
(Intercept) 12.09600 0.19873 60.866
meanses 5.33291 0.36916 14.446
sector 1.22645 0.30627 4.005
ses.gmc 2.93877 0.15509 18.949
meanses:ses.gmc 1.03890 0.29889 3.476
sector:ses.gmc -1.64261 0.23978 -6.850