Skip to content

Standard Error of a coef. in a 2-level model vs. 2 OLS models

8 messages · Simon Harmel, Harold Doran, Phillip Alday

#
Dear All,

I have fit two ols models (ols1 & ols2) and an mixed-effects model (m1).
ols1 is a simple lm() model that ignores the second-level. ols2 is a simple
lm() model that ignores the first-level.

For `ols1` model,  `sigma(ols1)^2` almost equals sum of variance components
in the `m1` model: 6.68 (bet.) + 39.15 (with.)
For `ols2` model, I wonder what does `sigma(ols2)^2` represents when
compared to the `m1` model?

Here is the fully reproducible code:

library(lme4)
library(tidyverse)

hsb <- read.csv('
https://raw.githubusercontent.com/rnorouzian/e/master/hsb.csv')
hsb_ave <- hsb %>% group_by(sch.id) %>% mutate(math_ave = mean(math)) %>%
slice(1) # data that only considers grouping but ignores lower level

ols1 <- lm(math ~ sector, data = hsb)
summary(ols1)

m1 <- lmer(math ~ sector + (1|sch.id), data = hsb)
summary(m1)

# `sigma(ols1)^2` almost equals 6.68 (bet.) + 39.15 (with.) from lmer

But if I fit another ols model that only considers the grouping structure
(ignoring lower level):

ols2 <- lm(math_ave ~ sector, data = hsb_ave)
summary(ols2)

Then what does `sigma(ols2)^2` should amount to when compared to the `m1`
model?
#
Just a clarification.

For `ols1` model, I can approximate its SE of the sector coefficient by
using the within and between variance components from the HLM model:
sqrt(( 6.68  + 39.15  )/45)/(160*.25))

BUT  For `ols2` model, how can I approximate its SE of the sector
coefficient by using the within and between variance components from the
HLM model?
On Sun, Sep 13, 2020 at 6:37 PM Simon Harmel <sim.harmel at gmail.com> wrote:

            

  
  
2 days later
#
This is not how standard errors are computed for linear or mixed linear models. I'm not sure what you're goal is, but the SEs are the square roots of the diagonal elements of the variance/covariance matrix of the fixed effects.

See ?vcov on how to extract that matrix.

-----Original Message-----
From: R-sig-mixed-models <r-sig-mixed-models-bounces at r-project.org> On Behalf Of Simon Harmel
Sent: Sunday, September 13, 2020 7:51 PM
To: r-sig-mixed-models <r-sig-mixed-models at r-project.org>
Subject: Re: [R-sig-ME] Standard Error of a coef. in a 2-level model vs. 2 OLS models

External email alert: Be wary of links & attachments.


Just a clarification.

For `ols1` model, I can approximate its SE of the sector coefficient by using the within and between variance components from the HLM model:
sqrt(( 6.68  + 39.15  )/45)/(160*.25))

BUT  For `ols2` model, how can I approximate its SE of the sector coefficient by using the within and between variance components from the HLM model?
On Sun, Sep 13, 2020 at 6:37 PM Simon Harmel <sim.harmel at gmail.com> wrote:

            
_______________________________________________
R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
#
Hi Harold,

I improved my question, and asked it on CrossValidated: (
https://stats.stackexchange.com/questions/487363/using-variance-components-of-a-mixed-model-to-obtain-std-error-of-a-coef-from-a
)

I would appreciate your answer, either here or on CrossValidated.

Many thanks, Simon

On Wed, Sep 16, 2020 at 4:45 PM Harold Doran <
harold.doran at cambiumassessment.com> wrote:

            

  
  
#
Simon

Crossposting like this is frowned on a bit, but I?ve been in your shoes trying to get an answer before. I think you might be a bit confused. I saw your questions in both places and you?re asking how to get the standard errors of the OLS model using by fitting a mixed model and using the variance components from that mixed model to get the standard errors from an OLS model.

This is what we refer to in statistics as ?bass-ackwards?. ?

If you want the standard errors of the fixed effects from an OLS model, compute them as follows:

ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2, 10, 20, labels = c("Ctl","Trt"))
weight <- c(ctl, trt)
lm.D9 <- lm(weight ~ group)
X <- model.matrix(lm.D9)

### Compute Standard errors of fixed effects
sqrt(diag(solve(crossprod(X)) * .6964^2))

### use built in extractor function to get them instead
sqrt(diag(vcov(lm.D9)))

Similarly, get the standard errors of the mixed model from its own variance/covariance matrix.


From: Simon Harmel <sim.harmel at gmail.com>
Sent: Wednesday, September 16, 2020 5:54 PM
To: Harold Doran <harold.doran at cambiumassessment.com>
Cc: r-sig-mixed-models <r-sig-mixed-models at r-project.org>
Subject: Re: [R-sig-ME] Standard Error of a coef. in a 2-level model vs. 2 OLS models

Hi Harold,

I improved my question, and asked it on CrossValidated: (https://stats.stackexchange.com/questions/487363/using-variance-components-of-a-mixed-model-to-obtain-std-error-of-a-coef-from-a)

I would appreciate your answer, either here or on CrossValidated.

Many thanks, Simon
On Wed, Sep 16, 2020 at 4:45 PM Harold Doran <harold.doran at cambiumassessment.com<mailto:harold.doran at cambiumassessment.com>> wrote:
This is not how standard errors are computed for linear or mixed linear models. I'm not sure what you're goal is, but the SEs are the square roots of the diagonal elements of the variance/covariance matrix of the fixed effects.

See ?vcov on how to extract that matrix.

-----Original Message-----
From: R-sig-mixed-models <r-sig-mixed-models-bounces at r-project.org<mailto:r-sig-mixed-models-bounces at r-project.org>> On Behalf Of Simon Harmel
Sent: Sunday, September 13, 2020 7:51 PM
To: r-sig-mixed-models <r-sig-mixed-models at r-project.org<mailto:r-sig-mixed-models at r-project.org>>
Subject: Re: [R-sig-ME] Standard Error of a coef. in a 2-level model vs. 2 OLS models

External email alert: Be wary of links & attachments.


Just a clarification.

For `ols1` model, I can approximate its SE of the sector coefficient by using the within and between variance components from the HLM model:
sqrt(( 6.68  + 39.15  )/45)/(160*.25))

BUT  For `ols2` model, how can I approximate its SE of the sector coefficient by using the within and between variance components from the HLM model?
On Sun, Sep 13, 2020 at 6:37 PM Simon Harmel <sim.harmel at gmail.com<mailto:sim.harmel at gmail.com>> wrote:

            
_______________________________________________
R-sig-mixed-models at r-project.org<mailto:R-sig-mixed-models at r-project.org> mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
#
Thanks Harold! I may have been insufficiently clear.

At its core, the question is asking: "Is the "between-cluster" variation
(i.e., ?00) in a two-level nested mixed-effects model (i.e., `m1`), the
same as the amount of "total variation" that we would otherwise get if we
only use the cluster-level data in a corresponding, NON-HLM regression
model (i.e., `ols2`)?"


library(lme4)
library(tidyverse)

hsb <- read.csv('
https://raw.githubusercontent.com/rnorouzian/e/master/hsb.csv')
hsb_ave <- hsb %>% group_by(sch.id) %>% mutate(math_ave = mean(math)) %>%
slice(1)

ols2 <- lm(math_ave ~ sector, data = hsb_ave)

m1 <- lmer(math ~ sector + (1|sch.id), data = hsb)

On Thu, Sep 17, 2020 at 11:16 AM Harold Doran <
harold.doran at cambiumassessment.com> wrote:

            

  
  
#
The answer is no.

Consider the case where the between-cluster variation goes to zero (i.e.
a singular fit), but the residual variation is not zero. The
between-cluster variation is clearly not the same as the "total
variation" / residual variation in a non mixed OLS regression on the
data aggregated within clusters.

The other way to see this is that lme4 actually uses the scaled random
effects, i.e. the variance relative to the residual variance, for
fitting. (This what the entries in the theta vector reflect: the lower
Cholesky factor of the variance-covariance matrix of the scaled random
effects.)

Best,

Phillip
On 17/09/2020 19:15, Simon Harmel wrote:
#
Thanks Phillip!

In other words, when we use the full data in a regular ols model (`ols1`
below), then the residual variation [i.e.,  sigma(ols1)^2] approximately
equals the sum of within- and between-cluster variation from a
corresponding mixed model (`m1` below).

But if we use cluster-level data in a regular ols model (`ols2` below), no
correspondence between the amounts of variation between the `ols2` model
and the corresponding mixed model (`m1` below) can be found?

library(lme4)
library(tidyverse)

hsb <- read.csv('
https://raw.githubusercontent.com/rnorouzian/e/master/hsb.csv')
hsb_ave <- hsb %>% group_by(sch.id) %>% mutate(math_ave = mean(math)) %>%
slice(1)

ols1 <- lm(math ~ sector, data = hsb)
summary(ols1)

m1 <- lmer(math ~ sector + (1|sch.id), data = hsb)
summary(m1)

ols2 <- lm(math_ave ~ sector, data = hsb_ave)
summary(ols2)
On Thu, Sep 17, 2020 at 12:23 PM Phillip Alday <phillip.alday at mpi.nl> wrote: