Dear All,
For my reproducible model below, SPSS gives the variance component of
119.95 for Y1, and 127.90 for Y2.
But in `nlme::lme()` my variance components are 105.78 for Y1 and 113.73
for Y2.
Can we make the `lme()` reproduce the SPSS's variance components?
#======= Data and R code:
dat <- read.csv('https://raw.githubusercontent.com/hkil/m/master/mv.l.csv')
library(nlme)
m2 <- lme(value ~0 + name, random = ~0 + name| Student, data = dat, method
= "ML")
Random effects variance covariance matrix
nameY1 nameY2
nameY1 105.780 60.869
nameY2 60.869 113.730
Random effects variances in R and SPSS not matching
6 messages · Simon Harmel, Phillip Alday, Ben Bolker
Without more information, we don't know for sure that the models are the same in both languages. It's too much of a time sink for a human to change model details randomly until the output matches some expected output, but you could probably do something with genetic programming or simulated annealing to do that.... But if you can get more information, I would start by making sure - that the contrasts are truly the same - assumed covariance structures are the same - that one language isn't dropping some observations that the other is keeping (check the reporting number of observations levels of the grouping var) - the estimation method is the same across languages (ML,REML; hopefully SPSS isn't using something like quasi-likelihood) - different optimizers (if available) give the same result across languages (i.e. make sure you're not in a local optimum) - cross checking the result against yet another software package For example, cross-checking against lme4 immediately hints that this model might not be advisable / have a well-defined optimum:
m2.4 <- lmer(value ~0 + name + (0 + name| Student), data = dat,
REML=FALSE) Error: number of observations (=1600) <= number of random effects (=1600) for term (0 + name | Student); the random-effects parameters and the residual variance (or scale parameter) are probably unidentifiable Phillip
On 31/3/21 10:15 pm, Simon Harmel wrote:
Dear All,
For my reproducible model below, SPSS gives the variance component of
119.95 for Y1, and 127.90 for Y2.
But in `nlme::lme()` my variance components are 105.78 for Y1 and 113.73
for Y2.
Can we make the `lme()` reproduce the SPSS's variance components?
#======= Data and R code:
dat <- read.csv('https://raw.githubusercontent.com/hkil/m/master/mv.l.csv')
library(nlme)
m2 <- lme(value ~0 + name, random = ~0 + name| Student, data = dat, method
= "ML")
Random effects variance covariance matrix
nameY1 nameY2
nameY1 105.780 60.869
nameY2 60.869 113.730
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Thank you. I'll be happy to give more info. SPSS model syntax is shown on in Table 14.5, p. 585 (type `606` in page slot) of this book ( http://docshare02.docshare.tips/files/31719/317194846.pdf). The SPSS output is shown on p. 588 (type `606` in page slot). I should add the covariance between `Y1` and `Y2` exactly match. and the log-likelihood seems to be almost identical. But variances differ by a lot. SPSS is using "ML". Please let me know if I can provide any further information. Thank you for your prompt reply, Simon
On Wed, Mar 31, 2021 at 3:36 PM Phillip Alday <me at phillipalday.com> wrote:
Without more information, we don't know for sure that the models are the same in both languages. It's too much of a time sink for a human to change model details randomly until the output matches some expected output, but you could probably do something with genetic programming or simulated annealing to do that.... But if you can get more information, I would start by making sure - that the contrasts are truly the same - assumed covariance structures are the same - that one language isn't dropping some observations that the other is keeping (check the reporting number of observations levels of the grouping var) - the estimation method is the same across languages (ML,REML; hopefully SPSS isn't using something like quasi-likelihood) - different optimizers (if available) give the same result across languages (i.e. make sure you're not in a local optimum) - cross checking the result against yet another software package For example, cross-checking against lme4 immediately hints that this model might not be advisable / have a well-defined optimum:
m2.4 <- lmer(value ~0 + name + (0 + name| Student), data = dat,
REML=FALSE) Error: number of observations (=1600) <= number of random effects (=1600) for term (0 + name | Student); the random-effects parameters and the residual variance (or scale parameter) are probably unidentifiable Phillip On 31/3/21 10:15 pm, Simon Harmel wrote:
Dear All,
For my reproducible model below, SPSS gives the variance component of
119.95 for Y1, and 127.90 for Y2.
But in `nlme::lme()` my variance components are 105.78 for Y1 and 113.73
for Y2.
Can we make the `lme()` reproduce the SPSS's variance components?
#======= Data and R code:
dat <- read.csv('
library(nlme) m2 <- lme(value ~0 + name, random = ~0 + name| Student, data = dat,
method
= "ML")
Random effects variance covariance matrix
nameY1 nameY2
nameY1 105.780 60.869
nameY2 60.869 113.730
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
ps. I should also add I tried to use manually created dummies (D1, D2
equivalent to SPSS's `index 1` and `index 2` dummies) to mimic SPSS syntax
but that didn't (as I expected) change anything.
lme(value ~ 0 + D1 + D2, random = ~0 + D1 + D2 | Student, data = dat,
method = "ML")
On Wed, Mar 31, 2021 at 3:43 PM Simon Harmel <sim.harmel at gmail.com> wrote:
Thank you. I'll be happy to give more info. SPSS model syntax is shown on in Table 14.5, p. 585 (type `606` in page slot) of this book ( http://docshare02.docshare.tips/files/31719/317194846.pdf). The SPSS output is shown on p. 588 (type `606` in page slot). I should add the covariance between `Y1` and `Y2` exactly match. and the log-likelihood seems to be almost identical. But variances differ by a lot. SPSS is using "ML". Please let me know if I can provide any further information. Thank you for your prompt reply, Simon On Wed, Mar 31, 2021 at 3:36 PM Phillip Alday <me at phillipalday.com> wrote:
Without more information, we don't know for sure that the models are the same in both languages. It's too much of a time sink for a human to change model details randomly until the output matches some expected output, but you could probably do something with genetic programming or simulated annealing to do that.... But if you can get more information, I would start by making sure - that the contrasts are truly the same - assumed covariance structures are the same - that one language isn't dropping some observations that the other is keeping (check the reporting number of observations levels of the grouping var) - the estimation method is the same across languages (ML,REML; hopefully SPSS isn't using something like quasi-likelihood) - different optimizers (if available) give the same result across languages (i.e. make sure you're not in a local optimum) - cross checking the result against yet another software package For example, cross-checking against lme4 immediately hints that this model might not be advisable / have a well-defined optimum:
m2.4 <- lmer(value ~0 + name + (0 + name| Student), data = dat,
REML=FALSE) Error: number of observations (=1600) <= number of random effects (=1600) for term (0 + name | Student); the random-effects parameters and the residual variance (or scale parameter) are probably unidentifiable Phillip On 31/3/21 10:15 pm, Simon Harmel wrote:
Dear All,
For my reproducible model below, SPSS gives the variance component of
119.95 for Y1, and 127.90 for Y2.
But in `nlme::lme()` my variance components are 105.78 for Y1 and 113.73
for Y2.
Can we make the `lme()` reproduce the SPSS's variance components?
#======= Data and R code:
dat <- read.csv('
library(nlme) m2 <- lme(value ~0 + name, random = ~0 + name| Student, data = dat,
method
= "ML")
Random effects variance covariance matrix
nameY1 nameY2
nameY1 105.780 60.869
nameY2 60.869 113.730
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
On 3/31/21 4:36 PM, Phillip Alday wrote:
Without more information, we don't know for sure that the models are the same in both languages. It's too much of a time sink for a human to change model details randomly until the output matches some expected output, but you could probably do something with genetic programming or simulated annealing to do that.... But if you can get more information, I would start by making sure - that the contrasts are truly the same - assumed covariance structures are the same - that one language isn't dropping some observations that the other is keeping (check the reporting number of observations levels of the grouping var) - the estimation method is the same across languages (ML,REML; hopefully SPSS isn't using something like quasi-likelihood) - different optimizers (if available) give the same result across languages (i.e. make sure you're not in a local optimum) - cross checking the result against yet another software package For example, cross-checking against lme4 immediately hints that this model might not be advisable / have a well-defined optimum:
m2.4 <- lmer(value ~0 + name + (0 + name| Student), data = dat,
REML=FALSE) Error: number of observations (=1600) <= number of random effects (=1600) for term (0 + name | Student); the random-effects parameters and the residual variance (or scale parameter) are probably unidentifiable
If you only have one observation per name per student, then this model will be overfitted. You could use glmmTMB with dispformula = ~0, or blme::blmer with a tight/small prior on the residual variance Based on this https://journal.r-project.org/archive/2017/RJ-2017-010/RJ-2017-010.pdf it looks like you can fix residual variance to a small value (not exactly zero!) in lme, e.g. control = lmerControl(sigma=1e-6) After experimenting with your example below, I can't set sigma to a value < 1 without a false convergence error. Setting it to 1 gives values much closer to your report (118.9, 126.9). Using opt="optim" in the lmerControl and sigma=1e-5 gives the results you're looking for; so does using returnObject=TRUE (which tells lme to give you the answer even though there was a convergence warning). There's also probably a way to rearrange this with a simpler random effect, but I haven't quite figured it out yet.
Phillip On 31/3/21 10:15 pm, Simon Harmel wrote:
Dear All,
For my reproducible model below, SPSS gives the variance component of
119.95 for Y1, and 127.90 for Y2.
But in `nlme::lme()` my variance components are 105.78 for Y1 and 113.73
for Y2.
Can we make the `lme()` reproduce the SPSS's variance components?
#======= Data and R code:
dat <- read.csv('https://raw.githubusercontent.com/hkil/m/master/mv.l.csv')
library(nlme)
m2 <- lme(value ~0 + name, random = ~0 + name| Student, data = dat, method
= "ML")
Random effects variance covariance matrix
nameY1 nameY2
nameY1 105.780 60.869
nameY2 60.869 113.730
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
I don't think the optimum is well defined:
library("lattice")
library("lme4")
m2.4 <- lmer(value ~0 + name + (0+name| Student), data = dat,
REML=FALSE, control=lmerControl(optimizer="bobyqa",check.nobs.vs.nRE="warning")) Warning messages: 1: number of observations (=1600) <= number of random effects (=1600) for term (0 + name | Student); the random-effects parameters and the residual variance (or scale parameter) are probably unidentifiable 2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model is nearly unidentifiable: large eigenvalue ratio - Rescale variables?
p <- profile(m2.4)
There were 50 or more warnings (use warnings() to see the first 50)
xyplot(p)
Those are some very bad profiles! Also, this goes back to the lme4 safety-check that I had to disable. The fundamental problem is that there isn't enough data to completely distinguish the residual variance from the RE, so you get difference answers for the RE variance depending on how much you attribute to the residual variance. I also tried to do this with MCMC and flat priors (always a bad idea, but...) and also ran into bad convergence issues. Phillip
On 31/3/21 10:43 pm, Simon Harmel wrote:
Thank you. I'll be happy to give more info.?SPSS model syntax is shown on in Table 14.5, p. 585 (type `606` in page slot) of this book (http://docshare02.docshare.tips/files/31719/317194846.pdf). The SPSS output is shown on p. 588 (type `606` in page slot). I should add the covariance between `Y1` and `Y2` exactly match. and the log-likelihood seems to be almost identical. But variances differ by a lot. SPSS is using "ML". Please let me know if I can provide any further information. Thank you for your prompt reply, Simon On Wed, Mar 31, 2021 at 3:36 PM Phillip Alday <me at phillipalday.com <mailto:me at phillipalday.com>> wrote: Without more information, we don't know for sure that the models are the same in both languages. It's too much of a time sink for a human to change model details randomly until the output matches some expected output, but you could probably do something with genetic programming or simulated annealing to do that.... But if you can get more information, I would start by making sure - that the contrasts are truly the same - assumed covariance structures are the same - that one language isn't dropping some observations that the other is keeping (check the reporting number of observations levels of the grouping var) - the estimation method is the same across languages (ML,REML; hopefully SPSS isn't using something like quasi-likelihood) - different optimizers (if available) give the same? result across languages (i.e. make sure you're not in a local optimum) - cross checking the result against yet another software package For example, cross-checking against lme4 immediately hints that this model might not be advisable / have a well-defined optimum:
> m2.4 <- lmer(value ~0 + name + (0 + name| Student), data = dat,
REML=FALSE)
Error: number of observations (=1600) <= number of random effects
(=1600) for term (0 + name | Student); the random-effects parameters and
the residual variance (or scale parameter) are probably unidentifiable
Phillip
On 31/3/21 10:15 pm, Simon Harmel wrote:
> Dear All,
>
> For my reproducible model below, SPSS gives the variance component of
> 119.95 for Y1, and 127.90 for Y2.
>
> But in `nlme::lme()` my variance components are 105.78 for Y1 and
113.73
> for Y2.
>
> Can we make the `lme()` reproduce the SPSS's variance components?
>
> #======= Data and R code:
> dat <-
>
> library(nlme)
>
> m2 <- lme(value ~0 + name, random = ~0 + name| Student, data =
dat, method
> = "ML")
>
> Random effects variance covariance matrix
>? ? ? ? ? ? ? nameY1? ?nameY2
> nameY1 105.780? 60.869
> nameY2? 60.869 113.730
>
>? ? ? ?[[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models at r-project.org
<mailto:R-sig-mixed-models at r-project.org> mailing list