
Random effects variances in R and SPSS not matching

6 messages · Simon Harmel, Phillip Alday, Ben Bolker

#
Dear All,

For my reproducible model below, SPSS gives the variance component of
119.95 for Y1, and 127.90 for Y2.

But in `nlme::lme()` my variance components are 105.78 for Y1 and 113.73
for Y2.

Can we make `lme()` reproduce SPSS's variance components?

#======= Data and R code:
dat <- read.csv('https://raw.githubusercontent.com/hkil/m/master/mv.l.csv')

library(nlme)

m2 <- lme(value ~ 0 + name, random = ~ 0 + name | Student, data = dat,
          method = "ML")

Random effects variance covariance matrix
         nameY1   nameY2
nameY1  105.780   60.869
nameY2   60.869  113.730
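
For reference, the matrix above can be extracted directly from the fit with

getVarCov(m2)    # random-effects variance-covariance matrix (variance scale)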
#
Without more information, we don't know for sure that the models are the
same in both languages.

It's too much of a time sink for a human to change model details
randomly until the output matches some expected output, but you could
probably do something with genetic programming or simulated annealing to
do that....

But if you can get more information, I would start by making sure
(a quick sketch of a few of these checks in R follows the list):
- that the contrasts are truly the same
- that the assumed covariance structures are the same
- that one language isn't dropping some observations that the other is
keeping (check the reported number of observations and the levels of the
grouping variable)
- that the estimation method is the same across languages (ML, REML;
hopefully SPSS isn't using something like quasi-likelihood)
- that different optimizers (if available) give the same result across
languages (i.e. make sure you're not in a local optimum)
- cross-checking the result against yet another software package
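
A quick sketch of the first few checks on the R side (assuming the `dat`
and `m2` objects from the original post):

options("contrasts")           # contrast coding R will use for factors
table(dat$name)                # observations available per outcome
length(unique(dat$Student))    # number of grouping levels actually used
-2 * logLik(m2)                # compare against the -2 log-likelihood SPSS reports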

For example, cross-checking against lme4 immediately hints that this
model might not be advisable / have a well-defined optimum:

lmer(value ~ 0 + name + (0 + name | Student), data = dat, REML = FALSE)
Error: number of observations (=1600) <= number of random effects
(=1600) for term (0 + name | Student); the random-effects parameters and
the residual variance (or scale parameter) are probably unidentifiable

Phillip
On 31/3/21 10:15 pm, Simon Harmel wrote:
#
Thank you. I'll be happy to give more info. The SPSS model syntax is shown
in Table 14.5, p. 585 (type `606` in the page slot) of this book (
http://docshare02.docshare.tips/files/31719/317194846.pdf).

The SPSS output is shown on p. 588 (type `606` in page slot).

I should add that the covariance between `Y1` and `Y2` matches exactly, and the
log-likelihoods seem almost identical, but the variances differ by a lot.
SPSS is using "ML".

Please let me know if I can provide any further information.

Thank you for your prompt reply,
Simon
On Wed, Mar 31, 2021 at 3:36 PM Phillip Alday <me at phillipalday.com> wrote:

#
PS: I should also add that I tried manually created dummies (D1 and D2,
equivalent to SPSS's `index 1` and `index 2` dummies) to mimic the SPSS syntax,
but that didn't change anything (as I expected).

lme(value ~ 0 + D1 + D2, random = ~ 0 + D1 + D2 | Student, data = dat,
    method = "ML")
On Wed, Mar 31, 2021 at 3:43 PM Simon Harmel <sim.harmel at gmail.com> wrote:

#
On 3/31/21 4:36 PM, Phillip Alday wrote:
If you only have one observation per name per student, then this
model will be overfitted. You could use glmmTMB with dispformula = ~0,
or blme::blmer with a tight/small prior on the residual variance.
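
A minimal sketch of the glmmTMB route (assuming the same `dat` as above;
untested here):

library(glmmTMB)
m_tmb <- glmmTMB(value ~ 0 + name + (0 + name | Student),
                 dispformula = ~ 0,   # fix residual variance at (essentially) zero
                 data = dat, REML = FALSE)
VarCorr(m_tmb)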

Based on this

https://journal.r-project.org/archive/2017/RJ-2017-010/RJ-2017-010.pdf

it looks like you can fix the residual variance to a small value (not
exactly zero!) in lme, e.g.

control = lmeControl(sigma = 1e-6)

   After experimenting with your example below, I can't set sigma to a
value < 1 without a false convergence error. Setting it to 1 gives
values much closer to your report (118.9, 126.9). Using opt = "optim" in
lmeControl and sigma = 1e-5 gives the results you're looking for; so
does using returnObject = TRUE (which tells lme to give you the answer
even though there was a convergence warning).
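
Pulling those pieces together, a sketch of the full call (my assembly of
the options mentioned above, not a verbatim transcript of what I ran):

m_fix <- lme(value ~ 0 + name, random = ~ 0 + name | Student, data = dat,
             method = "ML",
             control = lmeControl(opt = "optim", sigma = 1e-5,
                                  returnObject = TRUE))
VarCorr(m_fix)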

  There's also probably a way to rearrange this with a simpler random 
effect, but I haven't quite figured it out yet.
#
I don't think the optimum is well defined:

lmer(value ~ 0 + name + (0 + name | Student), data = dat, REML = FALSE,
     control = lmerControl(optimizer = "bobyqa",
                           check.nobs.vs.nRE = "warning"))
Warning messages:
1: number of observations (=1600) <= number of random effects (=1600)
for term (0 + name | Student); the random-effects parameters and the
residual variance (or scale parameter) are probably unidentifiable
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
  Model is nearly unidentifiable: large eigenvalue ratio
 - Rescale variables?
There were 50 or more warnings (use warnings() to see the first 50)

Those are some very bad profiles! Also, this goes back to the lme4
safety check that I had to disable. The fundamental problem is that
there isn't enough data to completely distinguish the residual variance
from the RE, so you get different answers for the RE variance depending
on how much you attribute to the residual variance. (Note that both
variances differ from the SPSS values by the same amount,
119.95 - 105.78 = 127.90 - 113.73 = 14.17, i.e. the discrepancy is just
variance being shifted between the RE and the residual term.)

I also tried to do this with MCMC and flat priors (always a bad idea,
but...) and also ran into bad convergence issues.

Phillip
On 31/3/21 10:43 pm, Simon Harmel wrote: