Unacceptably high autocorrelation in MCMCglmm

3 messages · Stuart Luppescu, Jarrod Hadfield

#
Hello, I'm running this ordered category outcome model:

glme5.very.len <- MCMCglmm(very.len.summative.o ~ 1,
                   prior = list(R = list(V = 1, fix = 1),
                                G = list(G1 = list(V = 1, nu = 0),
                                         G2 = list(V = 1, nu = 0),
                                         G3 = list(V = 1, nu = 0),
                                         G4 = list(V = 1, nu = 0))),
                   random = ~ emplid + deptid + grade.f + subject.f,
                   family = "ordinal",
                   nitt = 300000,
                   data = summative.ratings.prin.yr1.full)

I first ran it with nitt=100000 but got very high autocorrelations and
nonsensical variance components and fixed effects, so I increased nitt
to 200000 and then to 300000, with no change. Here's the summary
output:

 summary(glme5.very.len)

 Iterations = 3001:299991
 Thinning interval  = 10
 Sample size  = 29700 

 DIC: -13239.32 

 G-structure:  ~emplid

       post.mean  l-95% CI u-95% CI eff.samp
emplid     405.3 1.493e-11     1106    7.909

               ~deptid

       post.mean  l-95% CI u-95% CI eff.samp
deptid     131.8 1.118e-16    475.2    42.65

               ~grade.f

        post.mean  l-95% CI u-95% CI eff.samp
grade.f    0.9143 1.405e-17    1.575    15784

               ~subject.f

          post.mean  l-95% CI u-95% CI eff.samp
subject.f     1.633 1.951e-17    2.748    10101

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units         1        1        1        0

 Location effects: very.len.summative.o ~ 1 

            post.mean l-95% CI u-95% CI eff.samp  pMCMC    
(Intercept)    29.007    2.091   54.969    2.381 <3e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

 Cutpoints: 
                                     post.mean l-95% CI u-95% CI eff.samp
cutpoint.traitvery.len.summative.o.1     14.06   0.8102    27.38    9.382
cutpoint.traitvery.len.summative.o.2     40.34   2.9611    76.04    2.694

Here are some of the autocorrs:

 autocorr(glme5.very.len$VCV)
, , emplid

           emplid    deptid    grade.f  subject.f units
Lag 0   1.0000000 0.5860851 0.04668197 0.06081864   NaN
Lag 10  0.9514313 0.6132116 0.04345287 0.05652945   NaN
Lag 50  0.9459831 0.6259477 0.04881253 0.06093640   NaN
Lag 100 0.9433509 0.6282599 0.04492884 0.06037288   NaN
Lag 500 0.9267886 0.6373151 0.03873992 0.05371885   NaN

, , deptid

           emplid    deptid    grade.f  subject.f units
Lag 0   0.5860851 1.0000000 0.03070680 0.03453008   NaN
Lag 10  0.6137187 0.7579551 0.03233992 0.04139315   NaN
Lag 50  0.6255810 0.7169468 0.02903334 0.03960446   NaN
Lag 100 0.6269979 0.7029498 0.03244468 0.04857241   NaN
Lag 500 0.6322900 0.6650247 0.04049514 0.04306019   NaN

Is there a problem in my data or in the model?

Thank you.
#
Hi,

It looks like the probit has underflowed/overflowed. You can check
this by saving the latent variables and seeing whether their
absolute values exceed 7 (see Section 8.08 of the
CourseNotes).
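
A minimal sketch of that check, assuming the model has been re-fitted with pl = TRUE so that MCMCglmm stores the sampled latent variables in the Liab component of the fitted object. The helper name latent_overflow_summary and its limit argument are illustrative and not part of MCMCglmm, and the simulated matrix only stands in for a real Liab object.

```r
# Summarise how far sampled latent variables stray from zero.
# `liab` is a matrix of posterior samples (rows = saved iterations,
# columns = observations), e.g. model$Liab from a fit with pl = TRUE.
# The function name and the default limit of 7 are assumptions of this sketch.
latent_overflow_summary <- function(liab, limit = 7) {
  liab <- as.matrix(liab)
  abs_liab <- abs(liab)
  list(
    range       = range(liab),            # smallest and largest sampled value
    max_abs     = max(abs_liab),          # largest absolute value
    prop_beyond = mean(abs_liab > limit), # share of samples beyond the limit
    exceeds     = any(abs_liab > limit)   # TRUE if any sample is beyond it
  )
}

# Stand-in for model$Liab: 100 saved iterations x 5 observations.
set.seed(1)
fake_liab <- matrix(rnorm(500, mean = 0, sd = 2), nrow = 100, ncol = 5)
fake_liab[1, 1] <- 30  # one deliberately extreme value for illustration

res <- latent_overflow_summary(fake_liab)
print(res$max_abs)
print(res$exceeds)
```

With a real fit, the call would be along the lines of latent_overflow_summary(glme5.very.len$Liab); a large share of samples beyond the limit would point towards the overflow/underflow problem described here.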

This can happen with weak priors and (near) complete separation and/or  
with weak priors for effects that are heavily confounded.

I'm not sure how to proceed with underflow/overflow problems  
generally.  I could terminate the procedure, or I could truncate the  
latent variables at their overflow/underflow points. The latter is  
used by some WinBUGS users, but then WinBUGS handles the fact that the  
response is from a truncated normal not a normal - something which  
would be hard to program in MCMCglmm. Any thoughts would be useful.

Cheers,

Jarrod



Quoting Stuart Luppescu <slu at ccsr.uchicago.edu> on Fri, 16 Mar 2012  
17:38:15 -0500:

  
    
2 days later
#
On Sat, 2012-03-17 at 10:29 +0000, Jarrod Hadfield wrote:
Hi Jarrod, I think I've figured out why this is not working. I hope you
or someone can suggest a fix.

I am analyzing ratings data of observations of teacher performance.
Teachers are rated on more than one occasion on a 1-4 scale on 10
components. The object is to calculate the ICC as a measure of
interrater reliability (the percent of total variance attributed to
differences in teacher performance = variance in emplid/total variance).
This analysis worked perfectly fine using MCMCglmm with the 10
components as fixed effects.
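
For reference, the ICC described above (variance in emplid / total variance) can be computed per saved iteration from the variance samples. This is only a sketch: icc_samples and its extra_var argument are hypothetical names, the toy matrix uses made-up numbers in place of a real VCV object, and whether any additional link-scale variance belongs in the denominator depends on the family and is left to the user.

```r
# Posterior ICC for one variance component from a matrix of variance samples.
# `vcv` has one row per saved iteration and one named column per component,
# like as.matrix(model$VCV). `extra_var` is a hypothetical argument for any
# variance not stored in `vcv`; it defaults to 0 and is an assumption here.
icc_samples <- function(vcv, component, extra_var = 0) {
  vcv <- as.matrix(vcv)
  stopifnot(component %in% colnames(vcv))
  vcv[, component] / (rowSums(vcv) + extra_var)
}

# Stand-in for model$VCV with made-up numbers, purely to show the arithmetic.
toy_vcv <- cbind(emplid = c(2, 3), deptid = c(1, 1), units = c(1, 1))

icc <- icc_samples(toy_vcv, "emplid")
print(icc)        # one ICC value per saved iteration
print(mean(icc))  # posterior mean of the ICC
```

Computing the ratio within each iteration, rather than dividing posterior means, keeps the full posterior of the ICC available for intervals.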

What I'm doing now (which is NOT working) is calculating one single
summative rating per teacher based on combinations of all the component
ratings a teacher received in a year. That means only one datum per
teacher per year: no separate components and no multiple observations.
So, including the teacher ID (emplid) as a random effect will screw
things up because there is only one datum per teacher and no
within-teacher variance. 
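
A quick way to confirm this kind of situation is to count observations per level of the grouping factor. The sketch below is illustrative only: replication_summary is a made-up helper name, and the toy identifiers stand in for the emplid column of the real data.

```r
# Tabulate how many observations each level of a grouping factor has.
# If every level appears exactly once, the data carry little or no
# information to separate that factor's variance from the residual.
# `replication_summary` is an illustrative helper name.
replication_summary <- function(ids) {
  counts <- table(ids)
  list(
    n_levels       = length(counts),    # number of distinct levels
    n_singletons   = sum(counts == 1),  # levels observed only once
    all_singletons = all(counts == 1)   # TRUE if no level is replicated
  )
}

# Toy identifiers standing in for the emplid column.
one_per_teacher <- c("t1", "t2", "t3", "t4")
repeated        <- c("t1", "t1", "t2", "t3")

print(replication_summary(one_per_teacher)$all_singletons)
print(replication_summary(repeated)$all_singletons)
```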

Do you have any idea how to get around this problem?

Thank you very much for your help.