effective sample size in MCMCglmm
Hello again,
Thank you so much, I will change the code and specify pl=FALSE!
I ran the model with the settings Matthew suggested and got the output below. What effective sample size should I aim for in the G-structure term (studyid)? Should I rerun with different specifications, for example a larger burnin, until the effective sample size for studyid reaches 100?
On a separate note, I managed to plot the model, but the plots are illegible: with so many groups, each panel is tiny. Should I plot just a random subsample instead?
Thank you all so much,
D
Iterations = 3001:102901
Thinning interval = 100
Sample size = 1000
DIC: 2928.214

G-structure:  ~studyid

        post.mean  l-95% CI u-95% CI eff.samp
studyid   0.06139 5.424e-16   0.4621    21.16

              ~class

      post.mean l-95% CI u-95% CI eff.samp
class    0.7278    0.132    1.232    154.9

              ~idv(l_lfvcspn)

           post.mean l-95% CI u-95% CI eff.samp
l_lfvcspn.     752.6    50.16     2352     1000

R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units     2.533    1.907    3.204    186.5

Location effects: y ~ f_newage_c + x2n + x8n + x9n + x5n + l_lfvcspo + x3n + x4n + x6n + x7n + offset

            post.mean  l-95% CI  u-95% CI eff.samp  pMCMC
(Intercept) -7.033381 -7.535696 -6.524339    487.7 <0.001 ***
f_newage_c   0.009066 -0.025116  0.041159    909.3  0.596
x2nM        -0.107655 -0.403837  0.165664   1000.0  0.440
x8n1         0.413878  0.119578  0.745171    891.1  0.002 **
x9n1        -0.287382 -0.600003  0.016531    892.7  0.070 .
x5n         -0.001037 -0.006182  0.004417    864.8  0.670
l_lfvcspo    0.397952 -0.941309  1.455719   1000.0  0.426
x3n          0.079109 -0.001274  0.150664    914.1  0.042 *
x4n          0.058303 -0.067089  0.174404   1000.0  0.344
x6n         -0.013916 -0.067176  0.049719   1000.0  0.610
x7n         -0.014880 -0.081243  0.057582    710.6  0.654
offset       0.999452  0.977609  1.018083   1000.0 <0.001 ***
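
One way to read the eff.samp column is relative to how many samples were kept and how many iterations were run. The sketch below is a back-of-envelope comparison of this run with the earlier thin = 10 run quoted further down, using only numbers copied from the two summaries. The column names (`n_kept`, `ess_per_kept`, `ess_per_1000_iter`) are made up for illustration and are not MCMCglmm output.

```r
# Values copied from the two model summaries in this thread.
runs <- data.frame(
  run         = c("thin = 10", "thin = 100"),
  first       = c(2001, 3001),       # first retained iteration
  last        = c(199991, 102901),   # last retained iteration
  thin        = c(10, 100),
  ess_studyid = c(81.12, 21.16)      # eff.samp reported for studyid
)

# Number of retained samples implied by the "Iterations" line.
runs$n_kept <- (runs$last - runs$first) / runs$thin + 1

# Post-burn-in iterations actually run (retained samples times thinning).
runs$iter_post_burnin <- runs$n_kept * runs$thin

# Effective samples per retained sample, and per 1000 iterations run.
runs$ess_per_kept      <- runs$ess_studyid / runs$n_kept
runs$ess_per_1000_iter <- 1000 * runs$ess_studyid / runs$iter_post_burnin

print(runs[, c("run", "n_kept", "ess_per_kept", "ess_per_1000_iter")])
```

Here `n_kept` reproduces the reported sample sizes (19800 and 1000). For studyid, the effective sample size per retained sample is higher with thin = 100, but the effective sample size per 1000 post-burn-in iterations is not. That suggests thinning mainly dropped redundant samples, and that a longer run (larger nitt) rather than a larger burnin alone is what would be expected to raise eff.samp for studyid. This is only a rough reading, since eff.samp is itself a noisy estimate.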
From: Jarrod Hadfield <j.hadfield at ed.ac.uk>
Sent: Tuesday, October 10, 2017 12:29 PM
To: dani; Matthew; r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] effective sample size in MCMCglmm

Hi,

You probably want to have pl=FALSE too, unless you have a special reason to save the latent variables? Currently this saves 20,000*7,000 = 140,000,000 numbers, which will take about 1GB of memory.

Cheers,

Jarrod

On 10/10/2017 19:21, dani wrote:
> Hello Matthew,
>
> Thank you so much for such a thorough answer! This is so helpful! I truly appreciate it!
>
> Best regards,
>
> D
>
> ________________________________
> From: Matthew <mew0099 at auburn.edu>
> Sent: Tuesday, October 10, 2017 11:17 AM
> To: dani; r-sig-mixed-models at r-project.org
> Subject: Re: [R-sig-ME] effective sample size in MCMCglmm
>
> Hi Dani,
>
> You might want to have a good read through the extensive Course Notes
> that Jarrod Hadfield has written to accompany the MCMCglmm package.
> Particularly (but not exclusively), Sections 1.3.1 and all of 1.4
> pertain to these issues.
>
> Your specifications of `nitt`, `thin`, and `burnin` are such that you
> have retained almost 20,000 samples so it is not surprising your
> computer is having issues. However, the effective sample sizes being so
> low means that the almost 20,000 samples you have retained are not
> independent.
>
> What you really need to change is the thinning interval. Given how bad
> your effective sample sizes are compared to the total number of samples
> retained, I would start with `thin=100` and see if that reduces the
> autocorrelation between successive samples. You can check the
> autocorrelation with `autocorr.diag(m3.new$Sol[, 1:k])` and
> `autocorr.diag(m3.new$VCV)`, for the location effects and variance
> components, respectively.
>
> You will need to figure out the best `burnin`, but start with
> `burnin=3000` and increase if the traceplots show a pattern at the
> beginning of the trace.
>
> Remember to adjust `nitt` to run the MCMC long enough to get the desired
> number of samples, but not excessively longer. So nitt = burnin +
> thin*(number of samples to keep).
>
> All of this will result in a suggested
> model specification like the following, but this will likely need to be
> changed once you diagnose the performance with your actual data:
>
> nsamp <- 1000
> THIN <- 100
> BURNIN <- 3000
> NITT <- BURNIN + THIN*nsamp
> m3new <- MCMCglmm(y ~ f_newage_c+x2n+x8n+x9n+x5n+l_lfvcspo+x3n+x4n+x6n+x7n+offset,
>                   random =~ studyid+class+idv(l_lfvcspn),
>                   data = wo1,
>                   family = "poisson", prior=prior2,
>                   verbose=FALSE,
>                   thin = THIN, #<-- CHANGED
>                   burnin = BURNIN, #<-- CHANGED
>                   nitt = NITT, #<-- CHANGED
>                   saveX=TRUE, saveZ=TRUE, saveXL=TRUE, pr=TRUE, pl=TRUE)
> autocorr.diag(m3new$Sol[, 1:k])
> autocorr.diag(m3new$VCV)
>
> With regards to your other message concerning the trace plots, this is
> likely because you have so many samples (almost 20,000). Once you have
> changed the `nitt` to reflect the `thin` and `burnin` required to give
> you the least autocorrelation and effective samples close to the number
> of samples you want, then you should be able to save the model and run
> the MCMC diagnostics much more easily.
>
> Sincerely,
> Matthew
>
> On 10/10/2017 12:44 PM, dani wrote:
>> Hello everyone,
>>
>> My question is:
>>
>> do the effective samples I obtain in my MCMCglmm output (attached below) make sense?
>>
>> I understand that the rule of thumb is to get effective samples of at least 100-1000. How should I tweak the thin, burnin, and the nitt specifications? My computer reaches its memory limit fast and I have barely been able to run the model below.
>>
>> I have the following model:
>>
>> k<-12 # number of fixed effects
>>
>> prior2<-list(B=list(V=diag(k)*1e4, mu=rep(0,k)),
>>              R=list(V=1, nu=0),
>>              G=list(G1=list(V=1, nu=0),
>>                     G2=list(V=1, nu=0),
>>                     G3=list(V=1, nu=0)))
>>
>> prior2$B$mu[k]<-1
>> prior2$B$V[k,k]<-1e-4
>>
>> m3new <- MCMCglmm(y ~ f_newage_c+x2n+x8n+x9n+x5n+l_lfvcspo+x3n+x4n+x6n+x7n+offset,
>>                   random =~ studyid+class+idv(l_lfvcspn),
>>                   data = wo1,
>>                   family = "poisson", prior=prior2,
>>                   verbose=FALSE,
>>                   thin = 10,
>>                   burnin = 2000,
>>                   nitt = 200000,
>>                   saveX=TRUE, saveZ=TRUE, saveXL=TRUE, pr=TRUE, pl=TRUE)
>>
>> Iterations = 2001:199991
>> Thinning interval = 10
>> Sample size = 19800
>> DIC: 2930.006
>>
>> G-structure:  ~studyid
>>
>>         post.mean  l-95% CI u-95% CI eff.samp
>> studyid    0.1053 1.814e-11   0.5757    81.12
>>
>>               ~class
>>
>>       post.mean l-95% CI u-95% CI eff.samp
>> class    0.7008  0.07577    1.207    382.1
>>
>>               ~idv(l_lfvcspn)
>>
>>            post.mean l-95% CI u-95% CI eff.samp
>> l_lfvcspn.     705.7    37.33     2044    11852
>>
>> R-structure:  ~units
>>
>>       post.mean l-95% CI u-95% CI eff.samp
>> units     2.516    1.809     3.23    336.6
>>
>> Location effects: y ~ f_newage_c + x2n + x8n + x9n + x5n + l_lfvcspo + x3n + x4n + x6n + x7n + offset
>>
>>              post.mean   l-95% CI   u-95% CI eff.samp  pMCMC
>> (Intercept) -7.0395427 -7.5590206 -6.5187847    865.6 <5e-05 ***
>> f_newage_c   0.0099703 -0.0222981  0.0448880   3324.7 0.5615
>> x2nM        -0.1068377 -0.3782251  0.1678528   3760.2 0.4462
>> x8n1         0.4103047  0.0920875  0.7179638   3884.3 0.0099 **
>> x9n1        -0.2784715 -0.5975232  0.0495615   3337.9 0.0901 .
>> x5n         -0.0009378 -0.0064175  0.0044266   3528.4 0.7283
>> l_lfvcspo    0.4018810 -0.8271349  1.4468375  14536.6 0.4080
>> x3n          0.0789652 -0.0018683  0.1523108   3726.0 0.0438 *
>> x4n          0.0602655 -0.0643903  0.1859443   2711.9 0.3356
>> x6n         -0.0137132 -0.0728385  0.0449804   3489.7 0.6453
>>
>> Thank you all so much,
>> Dani
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
> --
> ****************************************************
> Matthew E. Wolak, Ph.D.
> Assistant Professor
> Department of Biological Sciences
> Auburn University
> 306 Funchess Hall
> Auburn, AL 36849, USA
> Email: matthew.wolak at auburn.edu
> Tel: 334-844-9242

--
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
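
The two rules of thumb quoted in this thread, nitt = burnin + thin*(number of samples to keep) and the storage cost of pl=TRUE, can be sketched as plain arithmetic. `nitt_for` and `latent_gb` are hypothetical helpers, not MCMCglmm functions, and the storage figure assumes one 8-byte double per latent variable per retained sample, with the 7,000 taken from Jarrod's message.

```r
# Hypothetical helper: total iterations needed to retain `nsamp` samples,
# following nitt = burnin + thin * (number of samples to keep).
nitt_for <- function(nsamp, thin, burnin) {
  burnin + thin * nsamp
}

# Hypothetical helper: approximate storage (in GB, 1e9 bytes) for saved
# latent variables, ASSUMING one 8-byte double per latent variable per
# retained sample. Real memory use will be somewhat higher.
latent_gb <- function(nsamp, n_latent, bytes_per_value = 8) {
  nsamp * n_latent * bytes_per_value / 1e9
}

# Settings suggested in the thread: keep 1000 samples, thin = 100, burnin = 3000.
nitt_for(nsamp = 1000, thin = 100, burnin = 3000)   # 103000

# Storage for the latent variables before and after reducing the sample count.
latent_gb(nsamp = 20000, n_latent = 7000)           # 1.12
latent_gb(nsamp = 1000,  n_latent = 7000)           # 0.056
```

Under these assumptions, keeping 1000 samples instead of roughly 20,000 cuts the latent-variable storage by a factor of 20 even with pl=TRUE, and setting pl=FALSE avoids it altogether.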