Message-ID: <1362089238.14071.9.camel@musuko.uchicago.edu>
Date: 2013-02-28T22:07:18Z
From: Stuart Luppescu
Subject: Data frame size limits in MCMCglmm?
In-Reply-To: <20130125103605.16911d1gwewajp5w@www.staffmail.ed.ac.uk>

On Fri, 2013-01-25 at 10:36 +0000, Jarrod Hadfield wrote:
> Hi Stuart,
> 
> 2.4 million records is bigger than anything I've tried but in theory  
> it should run, or return an error if it can't allocate enough memory.
> It definitely shouldn't be seg-faulting.  If you could send a  
> reproducible example (preferably one where it fails quickly) I will  
> take a look into it.

I finally got around to doing this analysis on a 25% random sample. It
ran but took about 25 hours for 100,000 iterations. (Was that too many?)

Here are the results:

 Iterations = 3001:99991
 Thinning interval  = 10
 Sample size  = 9700 

 DIC: 1739944 

 G-structure:  ~tid

    post.mean l-95% CI u-95% CI eff.samp
tid    0.4597   0.4426   0.4754     7732

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units         1        1        1        0

 Location effects: final.points ~ gr10 + gr11 + gr12 

            post.mean l-95% CI u-95% CI eff.samp  pMCMC    
(Intercept)    1.0179   1.0007   1.0347     6334 <1e-04 ***
gr10           0.3155   0.3033   0.3278     7514 <1e-04 ***
gr11           0.5825   0.5686   0.5959     7728 <1e-04 ***
gr12           0.7262   0.7121   0.7412     7390 <1e-04 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

 Cutpoints: 
                             post.mean l-95% CI u-95% CI eff.samp
cutpoint.traitfinal.points.1    0.9506   0.9459   0.9552     1458
cutpoint.traitfinal.points.2    1.9154   1.9097   1.9216     1092
cutpoint.traitfinal.points.3    2.9882   2.9807   2.9956     1096


The main reason I'm doing this analysis is to see whether the results
differ when the outcome is treated as ordered categories rather than as
numbers (which I've done with lmer). Does the fact that the posterior
means for the cutpoints are very close to the numerical values (roughly
1, 2, and 3) mean that I am not gaining much by treating the outcome as
ordered categories (and that I can just use the results from lmer)?
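
One way to make the comparison concrete is to look at the spacing
between adjacent cutpoints on the latent scale. The following is only a
rough sketch in Python, using the posterior means from the summary
above. It assumes that the first cutpoint is fixed at 0 for
identification and so is not listed in the summary; that is my
understanding of how MCMCglmm parameterises the ordinal model, not
something shown in this output. The variable names are made up for
illustration.

```python
# Posterior means of the estimated cutpoints, copied from the summary above.
estimated = [0.9506, 1.9154, 2.9882]

# Assumption: the first cutpoint is fixed at 0 for identification and is
# therefore not reported in the summary.
cutpoints = [0.0] + estimated

# Width of each interior category on the latent scale.
spacings = [round(b - a, 4) for a, b in zip(cutpoints, cutpoints[1:])]

# Largest relative deviation of any spacing from the mean spacing.
mean_spacing = sum(spacings) / len(spacings)
max_rel_dev = max(abs(s - mean_spacing) / mean_spacing for s in spacings)

print("spacings:", spacings)
print("max relative deviation: %.3f" % max_rel_dev)
```

Under that assumption the three interior widths are 0.9506, 0.9648 and
1.0728. Roughly equal widths would be consistent with the interior
categories being close to evenly spaced on the latent scale, but this
says nothing about the two end categories, which are unbounded, so it
is at best a partial check.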

Thanks.



-- 
Stuart Luppescu <slu at ccsr.uchicago.edu>
University of Chicago