Dear All, Most MLM packages (e.g., HLM, SPSS, SAS, STATA) provide a p-value for the variance components. My understanding based on ( https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#testing-significance-of-random-effects) is that this is not possible to achieve in R, right? If not, for my 4 models below, I assume I need to compare, using anova(), each model against its OLS equivalent to obtain a likelihood ratio test p-value for each model's variance component, correct? hsb <- read.csv(' https://raw.githubusercontent.com/rnorouzian/e/master/hsb.csv') library(lme4) m1 <- lmer(math ~ 1 + (1|sch.id), data = hsb) m2 <- lmer(math ~ meanses + (1|sch.id), data = hsb) m3 <- lmer(math ~ ses + (ses | sch.id), data = hsb) m4 <- lmer(math~ ses * meanses + (ses | sch.id ), data = hsb) ols1 <- lm(math ~ 1, data = hsb) ols2 <- lm(math ~ meanses, data = hsb) ols3 <- lm(math ~ ses, data = hsb) ols4 <- lm(math ~ ses * meanses, data = hsb)
Statistical significance of random-effects (lme4 or others)
11 messages · Victor Oribamise, Simon Harmel, Daniel Lüdecke +3 more
Hey Simon, You can check the lsmeans package in R, you can obtain p values for your models using the package Victor
On Sun, Sep 6, 2020 at 9:38 PM Simon Harmel <sim.harmel at gmail.com> wrote:
Dear All, Most MLM packages (e.g., HLM, SPSS, SAS, STATA) provide a p-value for the variance components. My understanding based on ( https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#testing-significance-of-random-effects ) is that this is not possible to achieve in R, right? If not, for my 4 models below, I assume I need to compare, using anova(), each model against its OLS equivalent to obtain a likelihood ratio test p-value for each model's variance component, correct? hsb <- read.csv(' https://raw.githubusercontent.com/rnorouzian/e/master/hsb.csv') library(lme4) m1 <- lmer(math ~ 1 + (1|sch.id), data = hsb) m2 <- lmer(math ~ meanses + (1|sch.id), data = hsb) m3 <- lmer(math ~ ses + (ses | sch.id), data = hsb) m4 <- lmer(math~ ses * meanses + (ses | sch.id ), data = hsb) ols1 <- lm(math ~ 1, data = hsb) ols2 <- lm(math ~ meanses, data = hsb) ols3 <- lm(math ~ ses, data = hsb) ols4 <- lm(math ~ ses * meanses, data = hsb) [[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Hi Victor, Thanks for your response. First, as far as I know "lsmeans" has now become "emmeans". Second, all my data and code is 100% reproducible, would you please let me know how can I possibly obtain the p-value for the random-effects' variance components in any of the 4 models I showed in my original question? Thanks, Simon On Sun, Sep 6, 2020 at 9:42 PM Victor Oribamise <victor.oribamise at gmail.com> wrote:
Hey Simon, You can check the lsmeans package in R, you can obtain p values for your models using the package Victor On Sun, Sep 6, 2020 at 9:38 PM Simon Harmel <sim.harmel at gmail.com> wrote:
Dear All, Most MLM packages (e.g., HLM, SPSS, SAS, STATA) provide a p-value for the variance components. My understanding based on ( https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#testing-significance-of-random-effects ) is that this is not possible to achieve in R, right? If not, for my 4 models below, I assume I need to compare, using anova(), each model against its OLS equivalent to obtain a likelihood ratio test p-value for each model's variance component, correct? hsb <- read.csv(' https://raw.githubusercontent.com/rnorouzian/e/master/hsb.csv') library(lme4) m1 <- lmer(math ~ 1 + (1|sch.id), data = hsb) m2 <- lmer(math ~ meanses + (1|sch.id), data = hsb) m3 <- lmer(math ~ ses + (ses | sch.id), data = hsb) m4 <- lmer(math~ ses * meanses + (ses | sch.id ), data = hsb) ols1 <- lm(math ~ 1, data = hsb) ols2 <- lm(math ~ meanses, data = hsb) ols3 <- lm(math ~ ses, data = hsb) ols4 <- lm(math ~ ses * meanses, data = hsb) [[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Dear Victor,
I'm looking for the p-value for the "variance components", the variance (or
sd) estimated for random-effects in the 4 models I showed?
For example, for m1 I'm looking for the p-value for the terms shown below.
Linear mixed model fit by REML ['lmerMod']
Formula: math ~ 1 + (1 | sch.id)
Data: hsb
REML criterion at convergence: 47116.8
Scaled residuals:
Min 1Q Median 3Q Max
-3.0631 -0.7539 0.0267 0.7606 2.7426
Random effects: *************
Groups Name Variance Std.Dev.
sch.id (Intercept) 8.614 2.935 *****P-VALUE HERE?****
Residual 39.148 6.257 **** P_VALUE HERE? ****
Number of obs: 7185, groups: sch.id, 160
On Sun, Sep 6, 2020 at 10:15 PM Simon Harmel <sim.harmel at gmail.com> wrote:
Hi Victor, Thanks for your response. First, as far as I know "lsmeans" has now become "emmeans". Second, all my data and code is 100% reproducible, would you please let me know how can I possibly obtain the p-value for the random-effects' variance components in any of the 4 models I showed in my original question? Thanks, Simon On Sun, Sep 6, 2020 at 9:42 PM Victor Oribamise < victor.oribamise at gmail.com> wrote:
Hey Simon, You can check the lsmeans package in R, you can obtain p values for your models using the package Victor On Sun, Sep 6, 2020 at 9:38 PM Simon Harmel <sim.harmel at gmail.com> wrote:
Dear All, Most MLM packages (e.g., HLM, SPSS, SAS, STATA) provide a p-value for the variance components. My understanding based on ( https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#testing-significance-of-random-effects ) is that this is not possible to achieve in R, right? If not, for my 4 models below, I assume I need to compare, using anova(), each model against its OLS equivalent to obtain a likelihood ratio test p-value for each model's variance component, correct? hsb <- read.csv(' https://raw.githubusercontent.com/rnorouzian/e/master/hsb.csv') library(lme4) m1 <- lmer(math ~ 1 + (1|sch.id), data = hsb) m2 <- lmer(math ~ meanses + (1|sch.id), data = hsb) m3 <- lmer(math ~ ses + (ses | sch.id), data = hsb) m4 <- lmer(math~ ses * meanses + (ses | sch.id ), data = hsb) ols1 <- lm(math ~ 1, data = hsb) ols2 <- lm(math ~ meanses, data = hsb) ols3 <- lm(math ~ ses, data = hsb) ols4 <- lm(math ~ ses * meanses, data = hsb) [[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
A non-statistician's two cents: 1. I'm not sure likelihood-ratio tests (LRTs) are valid at all for models fit using REML (rather than MLE). The anova() function seems to agree, given that its present version (4.0.2) refits the models using MLE in order to compare their deviances. 2. Even when the models have been fit using MLE, likelihood-ratio tests for variance components are only applicable in cases of a single variance component. In your case, this means a LRT can only be used for *m1 vs ols1* and *m2 vs ols2*. There, you simply divide the p-value reported by *anova(m1, ols1) *and *anova(m2, ols2)* by two. Both are obviously extremely statistically significant. However, models *m3 *and *m4* both have two random effects. The last time I checked, the default assumption of a chi-squared deviance is no longer applicable in such cases, so the p-values reported by Stata and SPSS are only approximate and tend to be too conservative. Perhaps you might apply an information criterion instead, such as the AIC <https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#can-i-use-aic-for-mixed-models-how-do-i-count-the-number-of-degrees-of-freedom-for-a-random-effect> . Best, J ma 7. syysk. 2020 klo 6.48 Simon Harmel (sim.harmel at gmail.com) kirjoitti:
Dear Victor,
I'm looking for the p-value for the "variance components", the variance (or
sd) estimated for random-effects in the 4 models I showed?
For example, for m1 I'm looking for the p-value for the terms shown below.
Linear mixed model fit by REML ['lmerMod']
Formula: math ~ 1 + (1 | sch.id)
Data: hsb
REML criterion at convergence: 47116.8
Scaled residuals:
Min 1Q Median 3Q Max
-3.0631 -0.7539 0.0267 0.7606 2.7426
Random effects: *************
Groups Name Variance Std.Dev.
sch.id (Intercept) 8.614 2.935 *****P-VALUE HERE?****
Residual 39.148 6.257 **** P_VALUE HERE? ****
Number of obs: 7185, groups: sch.id, 160
On Sun, Sep 6, 2020 at 10:15 PM Simon Harmel <sim.harmel at gmail.com> wrote:
Hi Victor, Thanks for your response. First, as far as I know "lsmeans" has now
become
"emmeans". Second, all my data and code is 100% reproducible, would you please let
me
know how can I possibly obtain the p-value for the random-effects'
variance
components in any of the 4 models I showed in my original question? Thanks, Simon On Sun, Sep 6, 2020 at 9:42 PM Victor Oribamise < victor.oribamise at gmail.com> wrote:
Hey Simon, You can check the lsmeans package in R, you can obtain p values for your models using the package Victor On Sun, Sep 6, 2020 at 9:38 PM Simon Harmel <sim.harmel at gmail.com>
wrote:
Dear All, Most MLM packages (e.g., HLM, SPSS, SAS, STATA) provide a p-value for
the
variance components. My understanding based on (
) is that this is not possible to achieve in R, right? If not, for my 4 models below, I assume I need to compare, using
anova(),
each model against its OLS equivalent to obtain a likelihood ratio test
p-value for each model's variance component, correct?
hsb <- read.csv('
https://raw.githubusercontent.com/rnorouzian/e/master/hsb.csv')
library(lme4)
m1 <- lmer(math ~ 1 + (1|sch.id), data = hsb)
m2 <- lmer(math ~ meanses + (1|sch.id), data = hsb)
m3 <- lmer(math ~ ses + (ses | sch.id), data = hsb)
m4 <- lmer(math~ ses * meanses + (ses | sch.id ), data = hsb)
ols1 <- lm(math ~ 1, data = hsb)
ols2 <- lm(math ~ meanses, data = hsb)
ols3 <- lm(math ~ ses, data = hsb)
ols4 <- lm(math ~ ses * meanses, data = hsb)
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Dear J, My goal is not to do any comparison between any models. Rather, for each model I want to know if the variance component is different from 0 or not. And what is a p-value for that. On Sun, Sep 6, 2020 at 11:21 PM Juho Kristian Ruohonen <
juho.kristian.ruohonen at gmail.com> wrote:
A non-statistician's two cents: 1. I'm not sure likelihood-ratio tests (LRTs) are valid at all for models fit using REML (rather than MLE). The anova() function seems to agree, given that its present version (4.0.2) refits the models using MLE in order to compare their deviances. 2. Even when the models have been fit using MLE, likelihood-ratio tests for variance components are only applicable in cases of a single variance component. In your case, this means a LRT can only be used for *m1 vs ols1* and *m2 vs ols2*. There, you simply divide the p-value reported by *anova(m1, ols1) *and *anova(m2, ols2)* by two. Both are obviously extremely statistically significant. However, models *m3 *and *m4* both have two random effects. The last time I checked, the default assumption of a chi-squared deviance is no longer applicable in such cases, so the p-values reported by Stata and SPSS are only approximate and tend to be too conservative. Perhaps you might apply an information criterion instead, such as the AIC <https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#can-i-use-aic-for-mixed-models-how-do-i-count-the-number-of-degrees-of-freedom-for-a-random-effect> . Best, J ma 7. syysk. 2020 klo 6.48 Simon Harmel (sim.harmel at gmail.com) kirjoitti:
Dear Victor,
I'm looking for the p-value for the "variance components", the variance
(or
sd) estimated for random-effects in the 4 models I showed?
For example, for m1 I'm looking for the p-value for the terms shown below.
Linear mixed model fit by REML ['lmerMod']
Formula: math ~ 1 + (1 | sch.id)
Data: hsb
REML criterion at convergence: 47116.8
Scaled residuals:
Min 1Q Median 3Q Max
-3.0631 -0.7539 0.0267 0.7606 2.7426
Random effects: *************
Groups Name Variance Std.Dev.
sch.id (Intercept) 8.614 2.935 *****P-VALUE HERE?****
Residual 39.148 6.257 **** P_VALUE HERE? ****
Number of obs: 7185, groups: sch.id, 160
On Sun, Sep 6, 2020 at 10:15 PM Simon Harmel <sim.harmel at gmail.com>
wrote:
Hi Victor, Thanks for your response. First, as far as I know "lsmeans" has now
become
"emmeans". Second, all my data and code is 100% reproducible, would you please let
me
know how can I possibly obtain the p-value for the random-effects'
variance
components in any of the 4 models I showed in my original question? Thanks, Simon On Sun, Sep 6, 2020 at 9:42 PM Victor Oribamise < victor.oribamise at gmail.com> wrote:
Hey Simon, You can check the lsmeans package in R, you can obtain p values for
your
models using the package Victor On Sun, Sep 6, 2020 at 9:38 PM Simon Harmel <sim.harmel at gmail.com>
wrote:
Dear All, Most MLM packages (e.g., HLM, SPSS, SAS, STATA) provide a p-value for
the
variance components. My understanding based on (
) is that this is not possible to achieve in R, right? If not, for my 4 models below, I assume I need to compare, using
anova(),
each model against its OLS equivalent to obtain a likelihood ratio
test
p-value for each model's variance component, correct?
hsb <- read.csv('
https://raw.githubusercontent.com/rnorouzian/e/master/hsb.csv')
library(lme4)
m1 <- lmer(math ~ 1 + (1|sch.id), data = hsb)
m2 <- lmer(math ~ meanses + (1|sch.id), data = hsb)
m3 <- lmer(math ~ ses + (ses | sch.id), data = hsb)
m4 <- lmer(math~ ses * meanses + (ses | sch.id ), data = hsb)
ols1 <- lm(math ~ 1, data = hsb)
ols2 <- lm(math ~ meanses, data = hsb)
ols3 <- lm(math ~ ses, data = hsb)
ols4 <- lm(math ~ ses * meanses, data = hsb)
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Hi Simon, I'm not sure if this is a useful question. The variance can / should never be negative, and it usually is always above 0 if you have some variation in your outcome depending on the group factors (random effects). Packages I know that do some "significance testing" or uncertainty estimation of random effects are lmerTest::ranova() (quite well documented what it does) or "arm::se.ranef()" resp. "parameters::standard_error(effects = "random")". The two latter packages compute standard errors for the conditional modes of the random effects (what you get with "ranef()"). Best Daniel -----Urspr?ngliche Nachricht----- Von: R-sig-mixed-models <r-sig-mixed-models-bounces at r-project.org> Im Auftrag von Simon Harmel Gesendet: Montag, 7. September 2020 06:28 An: Juho Kristian Ruohonen <juho.kristian.ruohonen at gmail.com> Cc: r-sig-mixed-models <r-sig-mixed-models at r-project.org> Betreff: Re: [R-sig-ME] Statistical significance of random-effects (lme4 or others) Dear J, My goal is not to do any comparison between any models. Rather, for each model I want to know if the variance component is different from 0 or not. And what is a p-value for that. On Sun, Sep 6, 2020 at 11:21 PM Juho Kristian Ruohonen <
juho.kristian.ruohonen at gmail.com> wrote:
A non-statistician's two cents: 1. I'm not sure likelihood-ratio tests (LRTs) are valid at all for models fit using REML (rather than MLE). The anova() function seems to agree, given that its present version (4.0.2) refits the models using
MLE
in order to compare their deviances. 2. Even when the models have been fit using MLE, likelihood-ratio tests for variance components are only applicable in cases of a single variance component. In your case, this means a LRT can only be used for
*m1
vs ols1* and *m2 vs ols2*. There, you simply divide the p-value reported by *anova(m1, ols1) *and *anova(m2, ols2)* by two. Both are obviously extremely statistically significant. However, models *m3 *and *m4* both have two random effects. The last time I checked, the default assumption of a chi-squared deviance is no longer applicable in such cases, so the p-values reported by Stata and SPSS are only
approximate
and tend to be too conservative. Perhaps you might apply an information criterion instead, such as the AIC
<https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#can-i-use-aic-for-m ixed-models-how-do-i-count-the-number-of-degrees-of-freedom-for-a-random-eff ect>
. Best, J
-- _____________________________________________________________________ Universit?tsklinikum Hamburg-Eppendorf; K?rperschaft des ?ffentlichen Rechts; Gerichtsstand: Hamburg | www.uke.de Vorstandsmitglieder: Prof. Dr. Burkhard G?ke (Vorsitzender), Joachim Pr?l?, Prof. Dr. Blanche Schwappach-Pignataro, Marya Verdel _____________________________________________________________________ SAVE PAPER - THINK BEFORE PRINTING
While I agree with Daniel that an assumption of zero random-effect variance doesn't make much sense, I would point out that in my understanding, a simple LRT comparing a model with and without the random effect, where applicable, tests exactly that. The p-value divided by 2 reflects the probability of a deviance difference equal to or greater than the one observed between the models, under the null hypothesis that the variance component equals zero. A very low p/2 means a random-effect variance of 0 is very implausible. Best, J ma 7. syysk. 2020 klo 9.13 Daniel L?decke (d.luedecke at uke.de) kirjoitti:
Hi Simon, I'm not sure if this is a useful question. The variance can / should never be negative, and it usually is always above 0 if you have some variation in your outcome depending on the group factors (random effects). Packages I know that do some "significance testing" or uncertainty estimation of random effects are lmerTest::ranova() (quite well documented what it does) or "arm::se.ranef()" resp. "parameters::standard_error(effects = "random")". The two latter packages compute standard errors for the conditional modes of the random effects (what you get with "ranef()"). Best Daniel -----Urspr?ngliche Nachricht----- Von: R-sig-mixed-models <r-sig-mixed-models-bounces at r-project.org> Im Auftrag von Simon Harmel Gesendet: Montag, 7. September 2020 06:28 An: Juho Kristian Ruohonen <juho.kristian.ruohonen at gmail.com> Cc: r-sig-mixed-models <r-sig-mixed-models at r-project.org> Betreff: Re: [R-sig-ME] Statistical significance of random-effects (lme4 or others) Dear J, My goal is not to do any comparison between any models. Rather, for each model I want to know if the variance component is different from 0 or not. And what is a p-value for that. On Sun, Sep 6, 2020 at 11:21 PM Juho Kristian Ruohonen < juho.kristian.ruohonen at gmail.com> wrote:
A non-statistician's two cents: 1. I'm not sure likelihood-ratio tests (LRTs) are valid at all for models fit using REML (rather than MLE). The anova() function seems to agree, given that its present version (4.0.2) refits the models using
MLE
in order to compare their deviances. 2. Even when the models have been fit using MLE, likelihood-ratio tests for variance components are only applicable in cases of a single variance component. In your case, this means a LRT can only be used
for *m1
vs ols1* and *m2 vs ols2*. There, you simply divide the p-value reported by *anova(m1, ols1) *and *anova(m2, ols2)* by two. Both are obviously extremely statistically significant. However, models *m3
*and
*m4* both have two random effects. The last time I checked, the default assumption of a chi-squared deviance is no longer applicable
in
such cases, so the p-values reported by Stata and SPSS are only
approximate
and tend to be too conservative. Perhaps you might apply an
information
criterion instead, such as the AIC
< https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#can-i-use-aic-for-m ixed-models-how-do-i-count-the-number-of-degrees-of-freedom-for-a-random-eff <https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#can-i-use-aic-for-mixed-models-how-do-i-count-the-number-of-degrees-of-freedom-for-a-random-eff> ect>
. Best, J
--
_____________________________________________________________________ Universit?tsklinikum Hamburg-Eppendorf; K?rperschaft des ?ffentlichen Rechts; Gerichtsstand: Hamburg | www.uke.de Vorstandsmitglieder: Prof. Dr. Burkhard G?ke (Vorsitzender), Joachim Pr?l?, Prof. Dr. Blanche Schwappach-Pignataro, Marya Verdel _____________________________________________________________________ SAVE PAPER - THINK BEFORE PRINTING
I'm replying to the last email in the thread, but I'm addressing several issues raised along the way: * I'm not sure you should call it a _likelihood_ ratio test for REML fitted models given that REML isn't _the_ likelihood. (Every couple of months this topic comes up, so I won't go into it here.) * That said, you can actually compare REML fitted models *if the fixed effects are identical*. (This is actually part of why REML is viewed with some skepticism by some -- the computation of the objective is dependent on the parameterization of the fixed effects.) * In all cases, the usual tool for model comparison (LRT) require nested models. This is why it's tricky to test different random-effects structures against each other: it can be quite easy to have non nested models. * The p/2 for LRT on the random effects comes from the standard LRT being a two-sided test, but because variances are bounded at zero, you actually need a one-sided test. * The variance components can actually be zero; such singular fits are mathematically well defined. (Being able to estimate such models was one of the advances in lme4 compared to its predecessor nlme.) A estimate of zero doesn't mean that there is between group (e.g. here: between school) variance, but rather that the variance between schools is not distinguishable from the variability induced by the residual variance. That sounds quite technical, but in more practical terms means "a zero estimate means that you can't detect any variation between groups beyond the residual variation; the between-group variation may be truly zero or you may simply lack the power to detect it". * anova() will work for comparing lme4 fits to OLS fits, but the lme4 fit has to be first:
anova(m1, ols1)
refitting model(s) with ML (instead of REML)
Data: hsb
Models:
ols1: math ~ 1
m1: math ~ 1 + (1 | sch.id)
npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
ols1 2 48104 48117 -24050 48100
m1 3 47122 47142 -23558 47116 983.92 1 < 2.2e-16 ***
---
Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ?
(For the technically curious: you have to specify the lme4 object first
due to the way single-dispatch works in R.)
* I would advise against significance tests and standard errors for (the
estimates of) the random effects because their (sampling) distribution
can be highly skewed and significance tests don't capture this.
* Instead, you should use confidence intervals. Profile confidence
intervals are okay, but the gold standard are from the parametric
bootstrap. Here are two examples from the OP's original code:
confint(m1, method="profile", oldNames=FALSE)
Computing profile confidence intervals ...
2.5 % 97.5 %
sd_(Intercept)|sch.id 2.594729 3.315880
sigma 6.154803 6.361786
(Intercept) 12.156289 13.117121
confint(m1, method="boot", oldNames=FALSE)
Computing bootstrap confidence intervals ...
2.5 % 97.5 %
sd_(Intercept)|sch.id 2.551532 3.272484
sigma 6.151822 6.363059
(Intercept) 12.091033 13.123430
see ?confint.merMod for more details about other possible arguments.
* Finally, be very careful interpreting p-values. The correct
interpretation is very difficult and tricky.
Best,
Phillip
On 7/9/20 8:29 am, Juho Kristian Ruohonen wrote:
While I agree with Daniel that an assumption of zero random-effect variance doesn't make much sense, I would point out that in my understanding, a simple LRT comparing a model with and without the random effect, where applicable, tests exactly that. The p-value divided by 2 reflects the probability of a deviance difference equal to or greater than the one observed between the models, under the null hypothesis that the variance component equals zero. A very low p/2 means a random-effect variance of 0 is very implausible. Best, J ma 7. syysk. 2020 klo 9.13 Daniel L?decke (d.luedecke at uke.de) kirjoitti:
Hi Simon, I'm not sure if this is a useful question. The variance can / should never be negative, and it usually is always above 0 if you have some variation in your outcome depending on the group factors (random effects). Packages I know that do some "significance testing" or uncertainty estimation of random effects are lmerTest::ranova() (quite well documented what it does) or "arm::se.ranef()" resp. "parameters::standard_error(effects = "random")". The two latter packages compute standard errors for the conditional modes of the random effects (what you get with "ranef()"). Best Daniel -----Urspr?ngliche Nachricht----- Von: R-sig-mixed-models <r-sig-mixed-models-bounces at r-project.org> Im Auftrag von Simon Harmel Gesendet: Montag, 7. September 2020 06:28 An: Juho Kristian Ruohonen <juho.kristian.ruohonen at gmail.com> Cc: r-sig-mixed-models <r-sig-mixed-models at r-project.org> Betreff: Re: [R-sig-ME] Statistical significance of random-effects (lme4 or others) Dear J, My goal is not to do any comparison between any models. Rather, for each model I want to know if the variance component is different from 0 or not. And what is a p-value for that. On Sun, Sep 6, 2020 at 11:21 PM Juho Kristian Ruohonen < juho.kristian.ruohonen at gmail.com> wrote:
A non-statistician's two cents: 1. I'm not sure likelihood-ratio tests (LRTs) are valid at all for models fit using REML (rather than MLE). The anova() function seems to agree, given that its present version (4.0.2) refits the models using
MLE
in order to compare their deviances. 2. Even when the models have been fit using MLE, likelihood-ratio tests for variance components are only applicable in cases of a single variance component. In your case, this means a LRT can only be used
for *m1
vs ols1* and *m2 vs ols2*. There, you simply divide the p-value reported by *anova(m1, ols1) *and *anova(m2, ols2)* by two. Both are obviously extremely statistically significant. However, models *m3
*and
*m4* both have two random effects. The last time I checked, the default assumption of a chi-squared deviance is no longer applicable
in
such cases, so the p-values reported by Stata and SPSS are only
approximate
and tend to be too conservative. Perhaps you might apply an
information
criterion instead, such as the AIC
< https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#can-i-use-aic-for-m ixed-models-how-do-i-count-the-number-of-degrees-of-freedom-for-a-random-eff <https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#can-i-use-aic-for-mixed-models-how-do-i-count-the-number-of-degrees-of-freedom-for-a-random-eff> ect>
. Best, J
--
_____________________________________________________________________ Universit?tsklinikum Hamburg-Eppendorf; K?rperschaft des ?ffentlichen Rechts; Gerichtsstand: Hamburg | www.uke.de Vorstandsmitglieder: Prof. Dr. Burkhard G?ke (Vorsitzender), Joachim Pr?l?, Prof. Dr. Blanche Schwappach-Pignataro, Marya Verdel _____________________________________________________________________ SAVE PAPER - THINK BEFORE PRINTING
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Also see RLRsim, pbkrtest. lmerTest::ranova() is more convenient (and sounds like what you're looking for), but RLRsim and pbkrtest are going to be more accurate for individual comparisons.
On 9/7/20 2:13 AM, Daniel L?decke wrote:
Hi Simon, I'm not sure if this is a useful question. The variance can / should never be negative, and it usually is always above 0 if you have some variation in your outcome depending on the group factors (random effects). Packages I know that do some "significance testing" or uncertainty estimation of random effects are lmerTest::ranova() (quite well documented what it does) or "arm::se.ranef()" resp. "parameters::standard_error(effects = "random")". The two latter packages compute standard errors for the conditional modes of the random effects (what you get with "ranef()"). Best Daniel -----Urspr?ngliche Nachricht----- Von: R-sig-mixed-models <r-sig-mixed-models-bounces at r-project.org> Im Auftrag von Simon Harmel Gesendet: Montag, 7. September 2020 06:28 An: Juho Kristian Ruohonen <juho.kristian.ruohonen at gmail.com> Cc: r-sig-mixed-models <r-sig-mixed-models at r-project.org> Betreff: Re: [R-sig-ME] Statistical significance of random-effects (lme4 or others) Dear J, My goal is not to do any comparison between any models. Rather, for each model I want to know if the variance component is different from 0 or not. And what is a p-value for that. On Sun, Sep 6, 2020 at 11:21 PM Juho Kristian Ruohonen < juho.kristian.ruohonen at gmail.com> wrote:
A non-statistician's two cents:
1. I'm not sure likelihood-ratio tests (LRTs) are valid at all for
models fit using REML (rather than MLE). The anova() function seems to
agree, given that its present version (4.0.2) refits the models using
MLE
in order to compare their deviances.
2. Even when the models have been fit using MLE, likelihood-ratio
tests for variance components are only applicable in cases of a single
variance component. In your case, this means a LRT can only be used for
*m1
vs ols1* and *m2 vs ols2*. There, you simply divide the p-value
reported by *anova(m1, ols1) *and *anova(m2, ols2)* by two. Both are
obviously extremely statistically significant. However, models *m3 *and
*m4* both have two random effects. The last time I checked, the
default assumption of a chi-squared deviance is no longer applicable in
such cases, so the p-values reported by Stata and SPSS are only
approximate
and tend to be too conservative. Perhaps you might apply an information
criterion instead, such as the AIC
<https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#can-i-use-aic-for-m ixed-models-how-do-i-count-the-number-of-degrees-of-freedom-for-a-random-eff ect>
. Best, J
--
_____________________________________________________________________ Universit?tsklinikum Hamburg-Eppendorf; K?rperschaft des ?ffentlichen Rechts; Gerichtsstand: Hamburg | www.uke.de Vorstandsmitglieder: Prof. Dr. Burkhard G?ke (Vorsitzender), Joachim Pr?l?, Prof. Dr. Blanche Schwappach-Pignataro, Marya Verdel _____________________________________________________________________ SAVE PAPER - THINK BEFORE PRINTING _______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
I highly appreciate everyone's valuable input. Thank you very much to you all, Simon
On Mon, Sep 7, 2020 at 10:58 AM Ben Bolker <bbolker at gmail.com> wrote:
Also see RLRsim, pbkrtest. lmerTest::ranova() is more convenient (and sounds like what you're looking for), but RLRsim and pbkrtest are going to be more accurate for individual comparisons. On 9/7/20 2:13 AM, Daniel L?decke wrote:
Hi Simon, I'm not sure if this is a useful question. The variance can / should
never
be negative, and it usually is always above 0 if you have some variation
in
your outcome depending on the group factors (random effects). Packages I know that do some "significance testing" or uncertainty estimation of random effects are lmerTest::ranova() (quite well
documented
what it does) or "arm::se.ranef()" resp.
"parameters::standard_error(effects
= "random")". The two latter packages compute standard errors for the conditional modes of the random effects (what you get with "ranef()"). Best Daniel -----Urspr?ngliche Nachricht----- Von: R-sig-mixed-models <r-sig-mixed-models-bounces at r-project.org> Im Auftrag von Simon Harmel Gesendet: Montag, 7. September 2020 06:28 An: Juho Kristian Ruohonen <juho.kristian.ruohonen at gmail.com> Cc: r-sig-mixed-models <r-sig-mixed-models at r-project.org> Betreff: Re: [R-sig-ME] Statistical significance of random-effects (lme4
or
others) Dear J, My goal is not to do any comparison between any models. Rather, for each model I want to know if the variance component is different from 0 or
not.
And what is a p-value for that. On Sun, Sep 6, 2020 at 11:21 PM Juho Kristian Ruohonen < juho.kristian.ruohonen at gmail.com> wrote:
A non-statistician's two cents:
1. I'm not sure likelihood-ratio tests (LRTs) are valid at all for
models fit using REML (rather than MLE). The anova() function seems
to
agree, given that its present version (4.0.2) refits the models
using
MLE
in order to compare their deviances.
2. Even when the models have been fit using MLE, likelihood-ratio
tests for variance components are only applicable in cases of a
single
variance component. In your case, this means a LRT can only be used
for
*m1
vs ols1* and *m2 vs ols2*. There, you simply divide the p-value
reported by *anova(m1, ols1) *and *anova(m2, ols2)* by two. Both are
obviously extremely statistically significant. However, models *m3
*and
*m4* both have two random effects. The last time I checked, the
default assumption of a chi-squared deviance is no longer
applicable in
such cases, so the p-values reported by Stata and SPSS are only
approximate
and tend to be too conservative. Perhaps you might apply an
information
criterion instead, such as the AIC
<
ixed-models-how-do-i-count-the-number-of-degrees-of-freedom-for-a-random-eff
ect>
. Best, J
--
_____________________________________________________________________ Universit?tsklinikum Hamburg-Eppendorf; K?rperschaft des ?ffentlichen
Rechts; Gerichtsstand: Hamburg | www.uke.de
Vorstandsmitglieder: Prof. Dr. Burkhard G?ke (Vorsitzender), Joachim
Pr?l?, Prof. Dr. Blanche Schwappach-Pignataro, Marya Verdel
_____________________________________________________________________ SAVE PAPER - THINK BEFORE PRINTING _______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models