Hello, I have run two analyses, each with the same data set and
predictors. One is a nested model run with lme; the other is a
cross-classified model with lmer. The only difference between the two
models is the added random effect. For example, the nested model
statement looks like this:
nested.lm3 <- lme(final.points ~ -1 + gr10 + gr11 + gr12 + per1 + per2 +
per4 + per5 + per6 + per7 + per8 + per9 + per10 + per11 + per12 +
cblackd + casiand + clatinod + cmale + cssoc + cscon +
cold4gr + cmlatent8 + computer +
...
jourlsm,
data=all.subj, random = ~ 1|sid, na.action=na.omit)
The cross-classified model looks like this:
lm4c <- lmer(final.points ~ -1 + gr10 + gr11 + gr12 + per1 + per2 + per4
+ per5 + per6 + per7 + per8 + per9 + per10 + per11 + per12 +
cblackd + casiand + clatinod + cmale + cssoc + cscon +
cold4gr + cmlatent8 + computer +
...
jourlsm +
( 1 | sid) + (1 | tid), data=all.subj, REML=F, verbose=T)
The variance components for the nested model are:
Random effects:
Formula: ~1 | sid
(Intercept) Residual
StdDev: 0.8826577 0.9259174
for the cross-classified model:
Groups Name Variance Std.Dev.
sid (Intercept) 0.75426 0.86848
tid (Intercept) 0.39601 0.62929
Residual 0.68535 0.82786
If we square and sum the variance components for the nested model, the
total variance is about 1.64. For the cross-classified model, the total
variance is about 1.84. Where did the additional variance come from?
Should I just interpret the size of the variance components on a
relative scale, are the units different, or what?
Stuart Luppescu -*-*- slu <at> ccsr <dot> uchicago <dot> edu
CCSR in UEI at U of C
Dear Stuart,
I think that the "extra" variance is substracted from the fixed effects.
Which indicates that some of the information in your fixed effects was
due to the levels of tid.
But to make a fair comparison you should run both models with lmer. And
then compare both the random effects variances and the fixed effect
estimates.
HTH,
Thierry
------------------------------------------------------------------------
----
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek
team Biometrie & Kwaliteitszorg
Gaverstraat 4
9500 Geraardsbergen
Belgium
Research Institute for Nature and Forest
team Biometrics & Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be
www.inbo.be
To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data.
~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey
-----Oorspronkelijk bericht-----
Van: r-sig-mixed-models-bounces at r-project.org
[mailto:r-sig-mixed-models-bounces at r-project.org] Namens
Stuart Luppescu
Verzonden: dinsdag 9 maart 2010 1:21
Aan: r-sig-mixed-models at r-project.org
Onderwerp: [R-sig-ME] Size/metric of variance components in
lme and lmer
Hello, I have run two analyses, each with the same data set
and predictors. One is a nested model run with lme; the other
is a cross-classified model with lmer. The only difference
between the two models is the added random effect. For
example, the nested model statement looks like this:
nested.lm3 <- lme(final.points ~ -1 + gr10 + gr11 + gr12 +
per1 + per2 +
per4 + per5 + per6 + per7 + per8 + per9 + per10 + per11 + per12 +
cblackd + casiand + clatinod + cmale +
cssoc + cscon + cold4gr + cmlatent8 + computer +
...
jourlsm,
data=all.subj, random = ~ 1|sid, na.action=na.omit)
The cross-classified model looks like this:
lm4c <- lmer(final.points ~ -1 + gr10 + gr11 + gr12 + per1 +
per2 + per4
+ per5 + per6 + per7 + per8 + per9 + per10 + per11 + per12 +
cblackd + casiand + clatinod + cmale +
cssoc + cscon + cold4gr + cmlatent8 + computer +
...
jourlsm +
( 1 | sid) + (1 | tid), data=all.subj, REML=F, verbose=T)
The variance components for the nested model are:
Random effects:
Formula: ~1 | sid
(Intercept) Residual
StdDev: 0.8826577 0.9259174
for the cross-classified model:
Groups Name Variance Std.Dev.
sid (Intercept) 0.75426 0.86848
tid (Intercept) 0.39601 0.62929
Residual 0.68535 0.82786
If we square and sum the variance components for the nested
model, the total variance is about 1.64. For the
cross-classified model, the total variance is about 1.84.
Where did the additional variance come from?
Should I just interpret the size of the variance components
on a relative scale, are the units different, or what?
--
Stuart Luppescu -*-*- slu <at> ccsr <dot> uchicago <dot> edu
CCSR in UEI at U of C
Druk dit bericht a.u.b. niet onnodig af.
Please do not print this message unnecessarily.
Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer
en binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is
door een geldig ondertekend document. The views expressed in this message
and any annex are purely those of the writer and may not be regarded as stating
an official position of INBO, as long as the message is not confirmed by a duly
signed document.
On Tue, 2010-03-09 at 10:16 +0100, ONKELINX, Thierry wrote:
Dear Stuart,
I think that the "extra" variance is substracted from the fixed
effects. Which indicates that some of the information in your fixed
effects was due to the levels of tid.
But to make a fair comparison you should run both models with lmer.
And then compare both the random effects variances and the fixed
effect estimates.
OK. I copied function call from the cross-classified model with lmer and
removed the second random effect and reran it. Here are the random
effects tables:
Nested model:
Data: all.subj
AIC BIC logLik deviance REMLdev
3448318 3449229 -1724083 3448166 3448707
Random effects:
Groups Name Variance Std.Dev.
sid (Intercept) 0.77403 0.87979
Residual 0.85746 0.92599
Number of obs: 1185094, groups: sid, 122897
Total variance: 1.63
--------------------------------------------------
Cross-classified model:
Data: all.subj
AIC BIC logLik deviance REMLdev
3236856 3237779 -1618351 3236702 3237217
Random effects:
Groups Name Variance Std.Dev.
sid (Intercept) 0.75066 0.86641
tid (Intercept) 0.39171 0.62587
Residual 0.68583 0.82815
Number of obs: 1185094, groups: sid, 122897; tid, 8939
Total variance: 1.83
Could some of the difference be due to covariance between the random
effects?
Stuart Luppescu <slu at ccsr.uchicago.edu>
University of Chicago
Stuart:
Not sure I know the answers, but a few thoughts. What covariance? In your non-nested model I don't think the random effects are assumed correlated are they? Doug Bates will certainly not be happy with what I do next, but it's one way to think about this.
An algebraic (not computational) expression for the variance of a mixed-effects model is var(y) = V = ZDZ' + sigma^2I
Where Z is the model matrix for the random effects, D is the variance/covariance of the random effects, sigma^2 is the residual variance and I is the identity matrix.
In your nested model, Z has a rather simple structure and D is a scalar. In you non-nested model, Z has a more complex structure denoting the linkage of students to multiple teachers and D is diagonal with I think D = [D_1, D_2] where D_1 = diag(0.75066, ..., 0.75066) and D_2 = diag(0.39171, ..., 0.39171).
In the first nested case, V will be block diagonal denoting no covariance between students in other teacher's classrooms. There will only be covariances between students in the same classroom. The portion ZDZ' will be pretty simple with the block diagonal elements all equal to the scalar variance.
In the non-nested case, Z will have no simple structure. It will be a horizontal concatenation of Z= [Z_1, Z_2] where Z_1 is the model matrix of the random effects for the students and Z_2 is the model matrix of the random effects for the teachers. I think in Z_1, there will be covariances between students who shared the same teacher, but it will not be bllock-diagonal. Let me think just about Z_1 and the corresponding part of D_1 for just a moment. The diagonal elements of the matrix V formed from Z_1 and D_1 will be larger in some places reflecting the multiple students that shared that teacher.
I *think* in some hand calcs I am looking at on my notes here, that would increase the var(y) to some extent and maybe why you see the larger variance in the non-nested case.
In any event, I am recommending that you maybe do these same hand calcs on your own and break this down to see why the variances would be different. I have done this many, many times and while tedious, it has always led to an answer in regards to this exact question.
-----Original Message-----
From: r-sig-mixed-models-bounces at r-project.org [mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of Stuart Luppescu
Sent: Tuesday, March 09, 2010 3:09 PM
To: ONKELINX, Thierry
Cc: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] Size/metric of variance components in lme and lmer
On Tue, 2010-03-09 at 10:16 +0100, ONKELINX, Thierry wrote:
Dear Stuart,
I think that the "extra" variance is substracted from the fixed
effects. Which indicates that some of the information in your fixed
effects was due to the levels of tid.
But to make a fair comparison you should run both models with lmer.
And then compare both the random effects variances and the fixed
effect estimates.
OK. I copied function call from the cross-classified model with lmer and
removed the second random effect and reran it. Here are the random
effects tables:
Nested model:
Data: all.subj
AIC BIC logLik deviance REMLdev
3448318 3449229 -1724083 3448166 3448707
Random effects:
Groups Name Variance Std.Dev.
sid (Intercept) 0.77403 0.87979
Residual 0.85746 0.92599
Number of obs: 1185094, groups: sid, 122897
Total variance: 1.63
--------------------------------------------------
Cross-classified model:
Data: all.subj
AIC BIC logLik deviance REMLdev
3236856 3237779 -1618351 3236702 3237217
Random effects:
Groups Name Variance Std.Dev.
sid (Intercept) 0.75066 0.86641
tid (Intercept) 0.39171 0.62587
Residual 0.68583 0.82815
Number of obs: 1185094, groups: sid, 122897; tid, 8939
Total variance: 1.83
Could some of the difference be due to covariance between the random
effects?