var(ranef(Random Effect)) not the same as the variance component

12 messages · Simon Harmel, Ben Bolker, Dimitris Rizopoulos +1 more

Original

1

12

Simon Harmel

Mon, Sep 7, 2020 4:22 PM #

Hello All,

A very basic question. Generally, `var(ranef(Random Effect))` may not
necessarily be the same as the variance component reported for that Random
Effect in the model output, correct?


Thank you all,
Simon

Ben Bolker

Mon, Sep 7, 2020 4:25 PM #

Yes, that's correct.

 From 
https://stats.stackexchange.com/questions/392283/interpreting-blups-or-varcorr-estimates-in-mixed-models/392307#392307

 > the covariance matrix of the empirical Bayes estimates obtained from 
ranef() is related to the covariance of this posterior distribution [of 
conditional modes/BLUPs] whereas VarCorr() is giving the D matrix, which 
is the covariance matrix of the prior distribution of the random 
effects. These are not the same.

On 9/7/20 7:22 PM, Simon Harmel wrote:

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Simon Harmel

Mon, Sep 7, 2020 5:53 PM #

Much appreciated, Ben. I will study those resources to better understand
the estimation process.

Thanks again,
Simon

On Mon, Sep 7, 2020 at 6:25 PM Ben Bolker <bbolker at gmail.com> wrote:

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Simon Harmel

Mon, Sep 7, 2020 6:22 PM #

Ben,

This might seem irrelevant to my previous question, but it is from the post
you linked in your previous answer. So, is it correct language to say:

By including Random-Effects (e.g., random intercepts) of some subjects, we
are **controlling/adjusting/holding constant** subjects's random variations
in that random-effect (e.g., variation in subjects' initial status)?

On Mon, Sep 7, 2020 at 7:53 PM Simon Harmel <sim.harmel at gmail.com> wrote:

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Ben Bolker

Mon, Sep 7, 2020 6:24 PM #

On 9/7/20 9:22 PM, Simon Harmel wrote:

I would probably say something like "incorporating among-subject 
variation in that term (e.g., the initial status) in the model". Or 
"accounting for".

        _______________________________________________
        R-sig-mixed-models at r-project.org
        <mailto:R-sig-mixed-models at r-project.org> mailing list
        https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Simon Harmel

Mon, Sep 7, 2020 6:35 PM #

Sure, I felt like "accounting for" might seem a bit vague, could we also
say "after leaving aside the variation among subjects' initial status, ..."?

On Mon, Sep 7, 2020 at 8:24 PM Ben Bolker <bbolker at gmail.com> wrote:


On 9/7/20 9:22 PM, Simon Harmel wrote:

Ben,

This might seem irrelevant to my previous question, but it is from the
post you linked in your previous answer. So, is it correct language to

say:

By including Random-Effects (e.g., random intercepts) of some
subjects, we are **controlling/adjusting/holding constant**
subjects's random variations in that random-effect (e.g., variation in
subjects' initial status)?

    I would probably say something like "incorporating among-subject
variation in that term (e.g., the initial status) in the model". Or
"accounting for".

On Mon, Sep 7, 2020 at 7:53 PM Simon Harmel <sim.harmel at gmail.com
<mailto:sim.harmel at gmail.com>> wrote:

    Much appreciated, Ben. I will study those resources to better
    understand the estimation process.

    Thanks again,
    Simon

    On Mon, Sep 7, 2020 at 6:25 PM Ben Bolker <bbolker at gmail.com
    <mailto:bbolker at gmail.com>> wrote:

            Yes, that's correct.

          From

https://stats.stackexchange.com/questions/392283/interpreting-blups-or-varcorr-estimates-in-mixed-models/392307#392307

          > the covariance matrix of the empirical Bayes estimates

        obtained from
        ranef() is related to the covariance of this posterior
        distribution [of
        conditional modes/BLUPs] whereas VarCorr() is giving the D
        matrix, which
        is the covariance matrix of the prior distribution of the random
        effects. These are not the same.

        On 9/7/20 7:22 PM, Simon Harmel wrote:

         > Hello All,
         >
         > A very basic question. Generally, `var(ranef(Random Effect))`

        may not

         > necessarily be the same as the variance component reported

        for that Random

         > Effect in the model output, correct?
         >
         >
         > Thank you all,
         > Simon
         >
         >       [[alternative HTML version deleted]]
         >
         > _______________________________________________
         > R-sig-mixed-models at r-project.org

        <mailto:R-sig-mixed-models at r-project.org> mailing list

         > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
         >

        _______________________________________________
        R-sig-mixed-models at r-project.org
        <mailto:R-sig-mixed-models at r-project.org> mailing list
        https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Dimitris Rizopoulos

Mon, Sep 7, 2020 8:02 PM #

Yes, you do not expect these two be the same. The variance components are the prior variances of the random effects, whereas var(ranef(model)) is the variance of the posterior means/modes of the random effects.

Best,
Dimitris

??
Dimitris Rizopoulos
Professor of Biostatistics
Erasmus University Medical Center
The Netherlands

Harold Doran

Tue, Sep 8, 2020 1:47 PM #

To add a little notation to this, we can use law of total variance, var(y) = E(var(Y|X)) + var(E(Y|X)). The conditional means of the random effects are E(Y|X) and hence their variance is only one portion of the total variance. 

-----Original Message-----
From: R-sig-mixed-models <r-sig-mixed-models-bounces at r-project.org> On Behalf Of D. Rizopoulos
Sent: Monday, September 7, 2020 11:02 PM
To: Simon Harmel <sim.harmel at gmail.com>; r-sig-mixed-models <r-sig-mixed-models at r-project.org>
Subject: Re: [R-sig-ME] var(ranef(Random Effect)) not the same as the variance component

External email alert: Be wary of links & attachments.

Yes, you do not expect these two be the same. The variance components are the prior variances of the random effects, whereas var(ranef(model)) is the variance of the posterior means/modes of the random effects.

Best,
Dimitris

Dimitris Rizopoulos
Professor of Biostatistics
Erasmus University Medical Center
The Netherlands

Simon Harmel

Wed, Sep 9, 2020 8:53 AM #

First of all, thank you all for your valuable input.

Dimitris,

Thank you I upvoted your answer on CV as well. But please help me
understand a few things.

1- By D matrix, you mean the G matrix shown in:
https://bookdown.org/marklhc/notes/simulating-multilevel-data.html#linear-growth-model

2- When you say variance components in the output are prior values, can you
tell me how these prior values are obtained? I guess from the data itself,
but how exactly (do we run individual models first to see how much
intercepts and slopes vary & co-vary and take those as prior)?

3- Harold above noted that: "The conditional means of the random effects
are E(Y|X) and hence their variance is only one portion of the total
variance [i.e., var(y)]." I'm not sure how this directly relates to my
question in this thread?

Thank you,
Simon




On Tue, Sep 8, 2020 at 3:47 PM Harold Doran <

harold.doran at cambiumassessment.com> wrote:

________________________________
From: R-sig-mixed-models <r-sig-mixed-models-bounces at r-project.org> on
behalf of Simon Harmel <sim.harmel at gmail.com>
Sent: Tuesday, September 8, 2020 1:22:15 AM
To: r-sig-mixed-models <r-sig-mixed-models at r-project.org>
Subject: [R-sig-ME] var(ranef(Random Effect)) not the same as the variance
component

Hello All,

A very basic question. Generally, `var(ranef(Random Effect))` may not
necessarily be the same as the variance component reported for that Random
Effect in the model output, correct?


Thank you all,
Simon

        [[alternative HTML version deleted]]

_______________________________________________
R-sig-mixed-models at r-project.org mailing list

https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstat.ethz.ch%2Fmailman%2Flistinfo%2Fr-sig-mixed-models&amp;data=02%7C01%7Cd.rizopoulos%40erasmusmc.nl%7Cae4132330fbd412dea9508d85384efba%7C526638ba6af34b0fa532a1a511f4ac80%7C0%7C0%7C637351177705805772&amp;sdata=agUgmCzM5ecsaoGLm8aPX0%2FuHZF1mK%2BXbP%2Fi6KX5UvI%3D&amp;reserved=0

        [[alternative HTML version deleted]]

Harold Doran

Wed, Sep 9, 2020 11:48 AM #

Simon

Here is an example to show what my notation implies with respect to your question:

fm1 <- lmer(Reaction ~ 1 + (1|Subject), sleepstudy)

sqrt(var(ranef(fm1)$Subject) + mean(sapply(attr(ranef(fm1, condVar=TRUE)[[1]], "postVar"),function(x) x)))

From: Simon Harmel <sim.harmel at gmail.com>
Sent: Wednesday, September 9, 2020 11:53 AM
To: D. Rizopoulos <d.rizopoulos at erasmusmc.nl>
Cc: r-sig-mixed-models <r-sig-mixed-models at r-project.org>; Harold Doran <harold.doran at cambiumassessment.com>; Ben Bolker <bbolker at gmail.com>
Subject: Re: [R-sig-ME] var(ranef(Random Effect)) not the same as the variance component

First of all, thank you all for your valuable input.

Dimitris,

Thank you I upvoted your answer on CV as well. But please help me understand a few things.

1- By D matrix, you mean the G matrix shown in: https://bookdown.org/marklhc/notes/simulating-multilevel-data.html#linear-growth-model

2- When you say variance components in the output are prior values, can you tell me how these prior values are obtained? I guess from the data itself, but how exactly (do we run individual models first to see how much intercepts and slopes vary & co-vary and take those as prior)?

3- Harold above noted that: "The conditional means of the random effects are E(Y|X) and hence their variance is only one portion of the total variance [i.e., var(y)]." I'm not sure how this directly relates to my question in this thread?

Thank you,
Simon

On Tue, Sep 8, 2020 at 3:47 PM Harold Doran <harold.doran at cambiumassessment.com<mailto:harold.doran at cambiumassessment.com>> wrote:

To add a little notation to this, we can use law of total variance, var(y) = E(var(Y|X)) + var(E(Y|X)). The conditional means of the random effects are E(Y|X) and hence their variance is only one portion of the total variance.

-----Original Message-----
From: R-sig-mixed-models <r-sig-mixed-models-bounces at r-project.org<mailto:r-sig-mixed-models-bounces at r-project.org>> On Behalf Of D. Rizopoulos
Sent: Monday, September 7, 2020 11:02 PM
To: Simon Harmel <sim.harmel at gmail.com<mailto:sim.harmel at gmail.com>>; r-sig-mixed-models <r-sig-mixed-models at r-project.org<mailto:r-sig-mixed-models at r-project.org>>
Subject: Re: [R-sig-ME] var(ranef(Random Effect)) not the same as the variance component

External email alert: Be wary of links & attachments.

Yes, you do not expect these two be the same. The variance components are the prior variances of the random effects, whereas var(ranef(model)) is the variance of the posterior means/modes of the random effects.

Best,
Dimitris

Dimitris Rizopoulos
Professor of Biostatistics
Erasmus University Medical Center
The Netherlands

Simon Harmel

Wed, Sep 9, 2020 1:18 PM #

Thank you, Harold.

1) `var(ranef(fm1)$Subject)` is the posterior variance of random effects or
their prior variance? (this was the point discussed so far in this thread)

2) Also, what about `mean(sapply(attr(ranef(fm1)$Subject,
"postVar"),function(x) x))`, what this expected variance represents in
words?

3) What does their sum represent? The total observed variance in random
deviations in intercepts of subjects?

Thank you very much

On Wed, Sep 9, 2020 at 1:48 PM Harold Doran <

harold.doran at cambiumassessment.com> wrote:

Simon



Here is an example to show what my notation implies with respect to your
question:



fm1 <- lmer(Reaction ~ 1 + (1|Subject), sleepstudy)



sqrt(var(ranef(fm1)$Subject) + mean(sapply(attr(ranef(fm1,
condVar=TRUE)[[1]], "postVar"),function(x) x)))



*From:* Simon Harmel <sim.harmel at gmail.com>
*Sent:* Wednesday, September 9, 2020 11:53 AM
*To:* D. Rizopoulos <d.rizopoulos at erasmusmc.nl>
*Cc:* r-sig-mixed-models <r-sig-mixed-models at r-project.org>; Harold Doran
<harold.doran at cambiumassessment.com>; Ben Bolker <bbolker at gmail.com>
*Subject:* Re: [R-sig-ME] var(ranef(Random Effect)) not the same as the
variance component



First of all, thank you all for your valuable input.



Dimitris,



Thank you I upvoted your answer on CV as well. But please help me
understand a few things.



1- By D matrix, you mean the G matrix shown in:
https://bookdown.org/marklhc/notes/simulating-multilevel-data.html#linear-growth-model



2- When you say variance components in the output are prior values, can
you tell me how these prior values are obtained? I guess from the data
itself, but how exactly (do we run individual models first to see how much
intercepts and slopes vary & co-vary and take those as prior)?



3- Harold above noted that: "The conditional means of the random effects
are E(Y|X) and hence their variance is only one portion of the total
variance [i.e., var(y)]." I'm not sure how this directly relates to my
question in this thread?



Thank you,

Simon









On Tue, Sep 8, 2020 at 3:47 PM Harold Doran <
harold.doran at cambiumassessment.com> wrote:

To add a little notation to this, we can use law of total variance, var(y)
= E(var(Y|X)) + var(E(Y|X)). The conditional means of the random effects
are E(Y|X) and hence their variance is only one portion of the total
variance.

-----Original Message-----
From: R-sig-mixed-models <r-sig-mixed-models-bounces at r-project.org> On
Behalf Of D. Rizopoulos
Sent: Monday, September 7, 2020 11:02 PM
To: Simon Harmel <sim.harmel at gmail.com>; r-sig-mixed-models <
r-sig-mixed-models at r-project.org>
Subject: Re: [R-sig-ME] var(ranef(Random Effect)) not the same as the
variance component

External email alert: Be wary of links & attachments.


Yes, you do not expect these two be the same. The variance components are
the prior variances of the random effects, whereas var(ranef(model)) is the
variance of the posterior means/modes of the random effects.

Best,
Dimitris


Dimitris Rizopoulos
Professor of Biostatistics
Erasmus University Medical Center
The Netherlands

________________________________
From: R-sig-mixed-models <r-sig-mixed-models-bounces at r-project.org> on
behalf of Simon Harmel <sim.harmel at gmail.com>
Sent: Tuesday, September 8, 2020 1:22:15 AM
To: r-sig-mixed-models <r-sig-mixed-models at r-project.org>
Subject: [R-sig-ME] var(ranef(Random Effect)) not the same as the variance
component

Hello All,

A very basic question. Generally, `var(ranef(Random Effect))` may not
necessarily be the same as the variance component reported for that Random
Effect in the model output, correct?


Thank you all,
Simon

        [[alternative HTML version deleted]]

_______________________________________________
R-sig-mixed-models at r-project.org mailing list

https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstat.ethz.ch%2Fmailman%2Flistinfo%2Fr-sig-mixed-models&amp;data=02%7C01%7Cd.rizopoulos%40erasmusmc.nl%7Cae4132330fbd412dea9508d85384efba%7C526638ba6af34b0fa532a1a511f4ac80%7C0%7C0%7C637351177705805772&amp;sdata=agUgmCzM5ecsaoGLm8aPX0%2FuHZF1mK%2BXbP%2Fi6KX5UvI%3D&amp;reserved=0

        [[alternative HTML version deleted]]

Harold Doran

Wed, Sep 9, 2020 1:45 PM #

Simon

Your original question was, ?A very basic question. Generally, `var(ranef(Random Effect))` may not necessarily be the same as the variance component reported for that Random Effect in the model output, correct??

First, let me define terms since there are a lot of variances in this model. There is the ?marginal variance? which is the variance of the random effects reported by lmer. Then there is the ?conditional variance of the random effects?, which is the variance of each individual BLUP, then you can take the ?observed variance? over all BLUPs.

Now, take the simple model used in my example,

fm1 <- lmer(Reaction ~ 1 + (1|Subject), sleepstudy)

So, this term, var(ranef(fm1)$Subject), is the observed variance of the BLUPs.

The marginal variance of the random effects in this model reported by lmer is 35.75^2 and as Dimitris pointed out, you would not expect the observed variance of the conditional means (i.e., the BLUPs) to be equal to the marginal variance of the random effects.

The random effects are conditional means, E(Y|X) and they have a conditional variance.  So:

E(var(Y|X)) = mean(sapply(attr(ranef(fm1)$Subject, "postVar"),function(x) x))

var(E(Y|X) = observed variance of the BLUPs

Combined, their sum is the marginal variance of the random effects. So, concretely, we observe that

(Intercept)
(Intercept)    35.75385

Which is the same as the marginal SD of the random effects from lmer.

Hope this helps.

BTW, note to Ben Bolker (if he even read this far ? ), I was originally the one who suggested to Doug that there be an extractor called ?postVar?. He regretted that term and expressed that to me many times (now I understand why). But, I had to remember how to get the conditional variances of the random effects and the code above is what I came up with since they are an attribute of the ranef extractor. That code is a bit ?ugly?, so either I?m doing it wrong (big possibility) or might I suggest a new extractor function can be written that is easier to get those?

From: Simon Harmel <sim.harmel at gmail.com>
Sent: Wednesday, September 9, 2020 4:19 PM
To: Harold Doran <harold.doran at cambiumassessment.com>
Cc: D. Rizopoulos <d.rizopoulos at erasmusmc.nl>; r-sig-mixed-models <r-sig-mixed-models at r-project.org>; Ben Bolker <bbolker at gmail.com>
Subject: Re: [R-sig-ME] var(ranef(Random Effect)) not the same as the variance component

Thank you, Harold.

1) `var(ranef(fm1)$Subject)` is the posterior variance of random effects or their prior variance? (this was the point discussed so far in this thread)

2) Also, what about `mean(sapply(attr(ranef(fm1)$Subject, "postVar"),function(x) x))`, what this expected variance represents in words?

3) What does their sum represent? The total observed variance in random deviations in intercepts of subjects?

Thank you very much

On Wed, Sep 9, 2020 at 1:48 PM Harold Doran <harold.doran at cambiumassessment.com<mailto:harold.doran at cambiumassessment.com>> wrote:

Simon

Here is an example to show what my notation implies with respect to your question:

fm1 <- lmer(Reaction ~ 1 + (1|Subject), sleepstudy)

sqrt(var(ranef(fm1)$Subject) + mean(sapply(attr(ranef(fm1, condVar=TRUE)[[1]], "postVar"),function(x) x)))

From: Simon Harmel <sim.harmel at gmail.com<mailto:sim.harmel at gmail.com>>
Sent: Wednesday, September 9, 2020 11:53 AM
To: D. Rizopoulos <d.rizopoulos at erasmusmc.nl<mailto:d.rizopoulos at erasmusmc.nl>>
Cc: r-sig-mixed-models <r-sig-mixed-models at r-project.org<mailto:r-sig-mixed-models at r-project.org>>; Harold Doran <harold.doran at cambiumassessment.com<mailto:harold.doran at cambiumassessment.com>>; Ben Bolker <bbolker at gmail.com<mailto:bbolker at gmail.com>>
Subject: Re: [R-sig-ME] var(ranef(Random Effect)) not the same as the variance component

First of all, thank you all for your valuable input.

Dimitris,

Thank you I upvoted your answer on CV as well. But please help me understand a few things.

1- By D matrix, you mean the G matrix shown in: https://bookdown.org/marklhc/notes/simulating-multilevel-data.html#linear-growth-model

2- When you say variance components in the output are prior values, can you tell me how these prior values are obtained? I guess from the data itself, but how exactly (do we run individual models first to see how much intercepts and slopes vary & co-vary and take those as prior)?

3- Harold above noted that: "The conditional means of the random effects are E(Y|X) and hence their variance is only one portion of the total variance [i.e., var(y)]." I'm not sure how this directly relates to my question in this thread?

Thank you,
Simon

On Tue, Sep 8, 2020 at 3:47 PM Harold Doran <harold.doran at cambiumassessment.com<mailto:harold.doran at cambiumassessment.com>> wrote:

To add a little notation to this, we can use law of total variance, var(y) = E(var(Y|X)) + var(E(Y|X)). The conditional means of the random effects are E(Y|X) and hence their variance is only one portion of the total variance.

-----Original Message-----
From: R-sig-mixed-models <r-sig-mixed-models-bounces at r-project.org<mailto:r-sig-mixed-models-bounces at r-project.org>> On Behalf Of D. Rizopoulos
Sent: Monday, September 7, 2020 11:02 PM
To: Simon Harmel <sim.harmel at gmail.com<mailto:sim.harmel at gmail.com>>; r-sig-mixed-models <r-sig-mixed-models at r-project.org<mailto:r-sig-mixed-models at r-project.org>>
Subject: Re: [R-sig-ME] var(ranef(Random Effect)) not the same as the variance component

External email alert: Be wary of links & attachments.

Yes, you do not expect these two be the same. The variance components are the prior variances of the random effects, whereas var(ranef(model)) is the variance of the posterior means/modes of the random effects.

Best,
Dimitris

Dimitris Rizopoulos
Professor of Biostatistics
Erasmus University Medical Center
The Netherlands