lmer vs lmer2

3 messages · Douglas Bates, dave fournier

#
On 10/5/07, dave fournier <otter at otter-rsch.com> wrote:
Because the SAS program is fitting a different model?

If you look at the sample SAS programs on the web site for the book
you will see that the authors are fitting models with fixed effects
for the logarithm of the height and the logarithm of the base height.

I have sort of lost track of the discussion of this example but I can
reproduce the results from Garrett Fitzmaurice's SAS analysis of these
data except for the variance-covariance of the random effects in the
model with correlated random effects for the intercept, the age and
the logarithm of the height.  With the development version of the lme4
package I get a (near) singular variance-covariance matrix in that
model fit while SAS PROC MIXED doesn't indicate a problem with the
fit.  The only indication of a problem from SAS is the large standard
errors on the estimates of the variance-covariance parameters.

I enclose the R script and output using the development version of the
lme4 package.  I have copied the variable names, etc. from the SAS
programs on Garrett's web site.  I fit two versions of each model, one
with all the subjects' data (fm1, fm2 and fm3) and one eliminating the
data for subject 197 (fm1a, fm2a and fm3a).  (Dave: according to the
information on Garrett's web site it is subject 197, not 177, who
appears to be an outlier.)
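The with/without-one-subject refit described above can be sketched generically. The following is a minimal illustration on simulated data (not the FEV1 data from the book), using Python's statsmodels MixedLM as a stand-in for lmer; all names and values here are assumptions for the sketch:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated longitudinal data (NOT the FEV1 data): a random-intercept
# model, refit after dropping one subject, mirroring the fm1 vs fm1a
# comparison described above.
rng = np.random.default_rng(1)
n_subj, n_obs = 30, 6
subj = np.repeat(np.arange(n_subj), n_obs)
age = np.tile(np.arange(n_obs), n_subj)
b = rng.normal(0, 0.5, n_subj)          # subject-level random intercepts
y = 1.0 + 0.2 * age + b[subj] + rng.normal(0, 0.3, subj.size)
df = pd.DataFrame({"y": y, "age": age, "subj": subj})

# Fit with all subjects, then refit with one subject removed.
fit_all = smf.mixedlm("y ~ age", df, groups=df["subj"]).fit()
sub = df[df["subj"] != 17]              # arbitrary subject, as with 197
fit_drop = smf.mixedlm("y ~ age", sub, groups=sub["subj"]).fit()
print(fit_all.params["age"], fit_drop.params["age"])
```

A large shift in a fixed-effect estimate between the two fits is the kind of sensitivity that flags an influential subject.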

The clue that model fm3a has a singular variance-covariance matrix is
the estimated correlation of -1.000.  Also, the verbose output shows
the converged value of the second parameter is very close to zero.
The first three parameters represent the variances of linear
combinations of the random effects.  The interpretation is that a
linear combination of the random effects for the intercept and for age
has zero variance.
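The interpretation above can be checked numerically. This small numpy sketch uses a hypothetical 2x2 random-effects variance-covariance matrix (the standard deviations are assumptions for the example, not the fm3a estimates) with the reported correlation of -1.000:

```python
import numpy as np

# Hypothetical 2x2 random-effects variance-covariance matrix with the
# boundary correlation of -1 (standard deviations chosen for the sketch).
sd_int, sd_age = 0.5, 0.02
rho = -1.0
Sigma = np.array([
    [sd_int**2,             rho * sd_int * sd_age],
    [rho * sd_int * sd_age, sd_age**2],
])

# A correlation of -1 makes the matrix singular: one eigenvalue is zero.
eigvals, eigvecs = np.linalg.eigh(Sigma)   # eigenvalues in ascending order
print(np.round(eigvals, 12))

# The eigenvector for the zero eigenvalue gives the linear combination of
# the intercept and age random effects that has zero variance.
w = eigvecs[:, 0]
print(float(w @ Sigma @ w))
```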

The big change in the development version of the lme4 package relative
to earlier versions is a rewriting of the mixed model equations so
that a singular variance-covariance matrix for the random effects is
approached smoothly, even though it is on the boundary.
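One way to see how the boundary can be approached smoothly is through a Cholesky-style parameterization of the covariance. The sketch below is an illustration of that general idea, not lme4's actual code; the factor values are assumptions for the example:

```python
import numpy as np

# Sketch of the idea (assumed, not lme4's actual implementation):
# parameterize Sigma = L @ L.T with L lower triangular.  Any real L
# yields a positive semidefinite Sigma, so an optimizer can move a
# diagonal element of L to zero and reach the singular boundary
# smoothly, without Sigma ever becoming invalid.
def sigma_from_theta(theta):
    """theta = (L11, L21, L22) for a 2x2 lower-triangular factor."""
    L = np.array([[theta[0], 0.0],
                  [theta[1], theta[2]]])
    return L @ L.T

for t22 in (0.5, 0.1, 0.0):          # shrink the second diagonal element
    S = sigma_from_theta((1.0, -0.8, t22))
    corr = S[0, 1] / np.sqrt(S[0, 0] * S[1, 1])
    print(f"L22={t22:4.1f}  det={np.linalg.det(S):.4f}  corr={corr:+.3f}")
```

At L22 = 0 the determinant is zero and the correlation is exactly -1, the same signature that flags a singular fit in the model output.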

I have permission from the book's authors to create an R package with
the data sets from the book.  The package will be called AppLong and
will include sample analyses reproducing the SAS analyses as best I
can.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fev1_Rout.txt
URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20071004/fd58543a/attachment.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fev1_R.txt
URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20071004/fd58543a/attachment-0001.txt>
#
Hi,

I checked this example out with ADMB-RE, using a modification of
our glmmADMB program, and have found the following:

1) Parameter estimates with ADMB-RE are stable and I get almost the
same ones with or without the group 177 observations.

2) I get almost exactly the same LL estimate as SAS.

3) My estimates for the fixed effects are similar to those in
lmer2, except for the Intercept.

Here are the estimates for lmer2 without group 177
    Estimate Std. Error t value
(Intercept) -1.948119   0.095877  -20.32
Height       1.640650   0.032800   50.02
Age          0.019379   0.001310   14.79
InitHeight   0.143977   0.111043    1.30
InitAge     -0.014618   0.007501   -1.95

these are the ADMB-RE estimates without group 177
  LL = 2294.85
   real_b           -2.0369e+000 1.0393e-001
   real_b           1.6460e+000 3.4587e-002
   real_b           1.9275e-002 1.3685e-003
   real_b           2.4857e-001 1.1984e-001
   real_b           -2.1290e-002 8.1749e-003

these are the estimates with group 177

   real_b           -2.0353e+000 1.0380e-001
   real_b           1.6438e+000 3.4430e-002
   real_b           1.9337e-002 1.3595e-003
   real_b           2.5070e-001 1.1966e-001
   real_b          -2.1486e-002 8.1618e-003

Here are the lmer2 estimates with group 177 included
(Intercept) -2.048023   0.101413  -20.19
Height       1.643644   0.031106   52.84
Age          0.019092   0.001391   13.73
InitHeight   0.262909   0.118516    2.22
InitAge     -0.021540   0.008111   -2.66

I think it is highly unlikely that the lmer2 Intercept estimate of
-1.948119 is the "correct" one when it changes so much with the
addition of these few observations, while ADMB-RE, just by chance,
is wrong but happens to get the same Intercept estimate with and
without group 177.  So it appears that lmer2 is not trustworthy.

Does anyone understand why the SAS point estimates appear to be 
completely different?

     Cheers,

       Dave



David A. Fournier
P.O. Box 2040,
Sidney, B.C. V8l 3S3
Canada
Phone/FAX 250-655-3364
http://otter-
#
Thanks for that Doug, and I apologize for my bad eyesight.
I really can't see the screen in my old age!

It was unfortunate that when I removed the wrong
observations from the data the LL turned out to be
almost identical to the one from the SAS analysis.

Doing it properly, when I remove the observations for group 197 from
the analysis I obtain the estimates

   real_b           -1.9486e+00 9.5787e-02
   real_b            1.6408e+00 3.3554e-02
   real_b            1.9368e-02 1.3501e-03
   real_b            1.4427e-01 1.1077e-01
   real_b           -1.4614e-02 7.4902e-03

which are identical to lmer2
for all practical purposes:

 > (Intercept) -1.948119   0.095877  -20.32
 > Height       1.640650   0.032800   50.02
 > Age          0.019379   0.001310   14.79
 > InitHeight   0.143977   0.111043    1.30
 > InitAge     -0.014618   0.007501   -1.95

However, what I was interested in was the application of slightly
robust methods in NLMMs.  (Once you go robust the models are
nonlinear even if the original model is linear.)  So I fit the
entire data set using a conservative robust likelihood: a 95%/5%
mixture of two normals, with the 5% component having 3 times the
standard deviation of the 95% one.  The estimates I obtained are

  real_b           -1.9730e+000 9.7074e-002
  real_b           1.6160e+000 2.7502e-002
  real_b           1.9959e-002 1.2192e-003
  real_b           2.1801e-001 1.1086e-001
  real_b           -1.9375e-002 7.5518e-003

compared to the non robust fit to all the data of

  real_b           -2.0353e+000 1.0380e-001
  real_b           1.6438e+000 3.4430e-002
  real_b           1.9337e-002 1.3595e-003
  real_b           2.5070e-001 1.1966e-001
  real_b          -2.1486e-002 8.1618e-003

which is not bad when one does not have to physically remove the
"bad" data.  So what I really wanted to argue is that one should
routinely use conservative robust methods when fitting RE models,
and in passing to point out that ADMB-RE provides a good platform
for doing this.
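The contaminated-normal likelihood described above can be sketched as follows. This is a minimal numpy/scipy illustration of the 95%/5% mixture idea, not the ADMB-RE code; the residual values and scale are assumptions for the example:

```python
import numpy as np
from scipy.stats import norm
from scipy.special import logsumexp

def robust_logpdf(resid, sigma, eps=0.05, scale=3.0):
    """Log-density of the contaminated normal:
    (1 - eps) * N(0, sigma^2) + eps * N(0, (scale*sigma)^2)."""
    return logsumexp(
        [np.log(1 - eps) + norm.logpdf(resid, 0.0, sigma),
         np.log(eps) + norm.logpdf(resid, 0.0, scale * sigma)],
        axis=0,
    )

sigma = 1.0
for r in (0.0, 2.0, 6.0):            # typical and outlying residuals
    plain = norm.logpdf(r, 0.0, sigma)
    robust = robust_logpdf(r, sigma)
    print(f"resid={r:3.1f}  normal={plain:8.3f}  robust={robust:8.3f}")
```

For a well-behaved residual the two log-densities are nearly identical, but for an outlying residual the robust log-likelihood penalty grows far more slowly, so aberrant observations drag the fit much less without having to be removed by hand.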

    Cheers,

     Dave