Skip to content

How to interpret verbose output

2 messages · Stuart Luppescu, Ben Bolker

#
Hello, I'm trying to do a variance decomposition of teacher performance
ratings. Each teacher is observed four times and rated on 9 components.
The model looks like this:

 lme8 <- lmer(rating ~
              (1|tid.f) + (1|obsorder.f) + (1|comp.f) +
               (1|tid.f:comp.f) +  (1|obsorder.f:comp.f) + (1|
tid.f:obsorder.f) ,
               data=ratings, REML=FALSE, verbose=2)

where tid.f is the teacher identifier, comp.f the component identifier,
and obsorder.f the observation number. 

This works fine for the whole dataset, but I want to do it separately by
decile based on the average rating for each teacher, so I added this: ,
subset=ratings$bins==quantile
where the bins are {1, ..., 10} indicating the decile, and quantile is a
loop index used like this: for(quantile in 1:10) {}

It works fine for the lowest decile, but fails for every decile after
that. I get 0.00 for the tid.f variance component, which is the one I'm
really interested in. I have no idea why. I checked the distribution of
average ratings by decile and it all looks unremarkable. I think there
may be a clue in the iteration details as shown by the verbose output
but I don't know how to interpret it. Here it is for decile 1:

npt = 8 , n =  6 
rhobeg =  0.2 , rhoend =  2e-07 
   0.020:  16:      18359.8;0.454554 0.735854 0.376302 0.705437 0.787591
0.802885 
  0.0020:  32:      18346.2;0.454004 0.699893 0.407403 0.574987 0.702683
0.880645 
 0.00020:  77:      18263.7;0.432231 0.697895 0.398860 0.0461275
0.326475  1.55039 
 2.0e-05: 292:      18252.6;0.432051 0.700841 0.394275 0.0472994
0.315551 0.130071 
 2.0e-06: 339:      18252.6;0.432029 0.700839 0.394515 0.0472285
0.314849 0.129962 
 2.0e-07: 364:      18252.6;0.432028 0.700822 0.394548 0.0472181
0.314777 0.129904 
At return
395:     18252.578: 0.432030 0.700822 0.394554 0.0472204 0.314755
0.129913
Linear mixed model fit by maximum likelihood  ['lmerMod']
Formula: rating ~ (1 | tid.f) + (1 | obsorder.f) + (1 | comp.f) + (1 |  
    tid.f:comp.f) + (1 | obsorder.f:comp.f) + (1 | tid.f:obsorder.f)
   Data: ratings
 Subset: ratings$bins == quantile

     AIC      BIC   logLik deviance df.resid 
 18268.6  18328.1  -9126.3  18252.6    12622 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.2200 -0.5592 -0.0306  0.5711  4.4942 

Random effects:
 Groups            Name        Variance Std.Dev.
 tid.f:comp.f      (Intercept) 0.033067 0.18184 
 tid.f:obsorder.f  (Intercept) 0.087013 0.29498 
 tid.f             (Intercept) 0.027579 0.16607 
 obsorder.f:comp.f (Intercept) 0.000395 0.01988 
 comp.f            (Intercept) 0.017551 0.13248 
 obsorder.f        (Intercept) 0.002990 0.05468 
 Residual                      0.177161 0.42090 
Number of obs: 12630, groups: tid.f:comp.f, 3159; tid.f:obsorder.f,
1404; tid.f, 351; obsorder.f:comp.f, 45; comp.f, 9; obsorder.f, 5

Fixed effects:
            Estimate Std. Error t value
(Intercept)  2.01657    0.05355   37.65

And here for decile 5:

npt = 8 , n =  6 
rhobeg =  0.2 , rhoend =  2e-07 
   0.020:  18:      15847.4;0.261469 0.514852 0.178718 0.0962381
0.152470 0.160530 
  0.0020:  35:      15749.5;0.356245 0.532056  0.00000 0.0961818
0.219586 0.166497 
 0.00020:  90:      15742.1;0.349036 0.516678  0.00000 0.0755764
0.330633 0.248861 
 2.0e-05: 107:      15742.1;0.348612 0.515454  0.00000 0.0766747
0.332641 0.251693 
 2.0e-06: 250:      15742.1;0.348335 0.515211  0.00000 0.0765898
0.330456 0.260112 
 2.0e-07: 273:      15742.1;0.348339 0.515207  0.00000 0.0765856
0.330510 0.260162 
At return
281:     15742.076: 0.348339 0.515207 1.97296e-07 0.0765858 0.330510
0.260162
Linear mixed model fit by maximum likelihood  ['lmerMod']
Formula: rating ~ (1 | tid.f) + (1 | obsorder.f) + (1 | comp.f) + (1 |  
    tid.f:comp.f) + (1 | obsorder.f:comp.f) + (1 | tid.f:obsorder.f)
   Data: ratings
 Subset: ratings$bins == quantile

     AIC      BIC   logLik deviance df.resid 
 15758.1  15817.7  -7871.0  15742.1    12772 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.6535 -0.3833  0.1362  0.5225  3.4610 

Random effects:
 Groups            Name        Variance  Std.Dev.
 tid.f:comp.f      (Intercept) 0.0192757 0.13884 
 tid.f:obsorder.f  (Intercept) 0.0421666 0.20535 
 tid.f             (Intercept) 0.0000000 0.00000 
 obsorder.f:comp.f (Intercept) 0.0009318 0.03052 
 comp.f            (Intercept) 0.0173531 0.13173 
 obsorder.f        (Intercept) 0.0107521 0.10369 
 Residual                      0.1588568 0.39857 
Number of obs: 12780, groups: tid.f:comp.f, 3195; tid.f:obsorder.f,
1420; tid.f, 355; obsorder.f:comp.f, 45; comp.f, 9; obsorder.f, 5

Fixed effects:
            Estimate Std. Error t value
(Intercept)  2.85126    0.06774   42.09

For decile 5, I notice that the column fourth from the right goes to 0
after the first line. Does that mean something? Any ideas about why this
is failing?

Thanks in advance.
#
On 14-05-21 06:15 PM, Stuart Luppescu wrote:
This printing comes from minqa::bobyqa.  ?bobyqa says:

iprint The value of ?iprint? should be set to ?0, 1, 2 or 3?,
          which controls the amount of printing. Specifically, there is
          no output if ?iprint=0? and there is output only at the
          return if ?iprint=1?. Otherwise, each new value of ?rho? is
          printed, with the best vector of variables so far and the
          corresponding value of the objective function. Further, each
          new value of the objective function with its variables are
          output if ?iprint=3?.  Default value is ?0?.
Number of points used for quadratic approximation, number of points
(parameters)
rho (current trust region radius): number of function evaluations
(?): current value of objective function (PWRSS/deviance); values of
parameters ("theta") -- lower triangle of the Cholesky factor of the RE
variance-covariance matrices, scaled by the residual standard deviation.
It means the second element of the theta vector (which controls the
among-'tid.f' variation in the intercept) is going to zero.  This is
probably the correct numeric value ...

  I would consider further


(1) scaling and centering continuous predictors to see if that helps
(2) simulating pseudo-data and fitting it to see how it performs
(?simulate.merMod will help with this)
(3) using bbmle::slice2D to view 2D slices of the deviance surface.