lmer() vs. lme() gave different variance component estimates
On 09/18/2010 12:09 PM, Jarrod Hadfield wrote:
Both converge to the same answer on Fedora and Mac (Mac sessionInfo() below)
[snip]
lmer(score~trt+(1|id/eye),dat, verbose=T, start=list(matrix(.81),matrix(.57)))
  1:     1304.7519:  2.42844  1.75398
  2:     776.66857:  6.30805  0.00000
  3:     657.56008:  7.73875  0.00000
  4:     514.89384:  10.7783  0.000123881
  5:     460.61372:  13.2310  0.00140974
  6:     434.88979:  15.6779  0.00908817
  7:     426.68762:  17.5388  0.0316599
  8:     424.89288:  18.6843  0.0743135
  9:     424.71437:  19.1269  0.132248
 10:     424.70495:  19.2146  0.203365
 11:     424.68492:  19.3175  0.420732
[snip]
l <- lmer(score~trt+(1|id/eye),dat,
          start=list(matrix(.81),matrix(.57)), verbose=T)
  0:     2226.5319: 0.810000 0.570000
  1:     1304.7507:  2.42844  1.75398
  2:     776.66839:  6.30805  0.00000
  3:     657.56008:  7.73875  0.00000
  4:     514.89378:  10.7783  0.00000
  5:     460.61371:  13.2310  0.00000
  6:     434.88972:  15.6779  1.52026e-08
  7:     426.68822:  17.5387  6.57868e-08
  8:     424.89428:  18.6843  1.41900e-07
  9:     424.71823:  19.1258  2.47916e-07
 10:     424.71356:  19.2075  3.75766e-07
 11:     424.71354:  19.2122  5.37851e-07
 12:     424.71354:  19.2122  7.55161e-07
This would appear to be the crux. Notice how in both cases the 2nd parameter gets thrown to (near-)zero. On the 64-bit machine (the first trace) it manages to claw its way back; in 32 bits (the second trace) it struggles to recover as well, but too slowly, so convergence is declared prematurely.

I'm not quite up to speed on the current implementation of lmer, but I guess that it is still using the log-Cholesky representation, and I suspect that the above displays a general weakness of log-parametrizations: by introducing singularities at the boundary of the parameter space, they turn an otherwise perfectly well-behaved likelihood into one with extensive, very flat regions.

In this case, the REML likelihood can be written explicitly as a product of three chi-square terms with a linear parametrization of their scale parameters, profiled over one of the parameters, and with an explicit formula for the maximum to boot. Someone with a large piece of paper and sufficient time on their hands should be able to map the situation out in extensive detail.

(I'm also continually fascinated by the fact that such large differences come up between 32-bit and 64-bit platforms, and I can't quite escape the suspicion that somewhere in our code we have an unintended platform dependence. However, I don't think that is the main point here.)
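
To make the remark about flat regions under a log-parametrization concrete, here is a rough sketch for a single scale parameter. The symbols are assumptions introduced only for illustration and are not taken from the lmer code: \theta \ge 0 is the scale parameter, \ell(\theta) is the (REML) log-likelihood viewed as a smooth function of \theta, and \eta = \log\theta is the parameter an optimizer would work with.

```latex
% Assumed notation (hypothetical, not from lmer):
%   \theta \ge 0      scale parameter on its natural scale
%   \eta = \log\theta the log-parametrized version
%   \ell(\theta)      log-likelihood, with \ell' and \ell'' bounded near \theta = 0
\begin{align*}
  \frac{d\ell}{d\eta}
    &= \frac{d\ell}{d\theta}\,\frac{d\theta}{d\eta}
     = \theta\,\ell'(\theta), \\
  \frac{d^{2}\ell}{d\eta^{2}}
    &= \theta^{2}\,\ell''(\theta) + \theta\,\ell'(\theta).
\end{align*}
% As \theta \to 0^{+} (that is, \eta \to -\infty), both derivatives tend to 0,
% however steep \ell is as a function of \theta itself.
```

So even when \ell'(0) is clearly nonzero and the likelihood points firmly away from the boundary, an optimizer working in \eta sees a gradient that is scaled down by \theta. With \theta of order 1e-7 the surface looks essentially flat, and a convergence test based on small gradients or small changes in the objective can be satisfied well away from the maximum.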
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com