Dear list, I am hoping to generate predictions (with CIs) based on new data for a model with 3 Gaussian response variables and 1 blocking variable. A very similar and recent question hasn't received a substantive reply yet, here: https://stat.ethz.ch/pipermail/r-sig-mixed-models/2015q2/023616.html . Jarrod mentioned on this list some months ago that a version of MCMCglmm that can predict with new data is imminent, but I haven't seen any signs of it - https://stat.ethz.ch/pipermail/r-sig-mixed-models/2015q1/023329.html . I believe I understand the process as discussed here - https://stat.ethz.ch/pipermail/r-sig-mixed-models/2013q1/020065.html http://stackoverflow.com/questions/21057548/how-can-one-simulate-quantities-of-interest-from-the-posterior-density-in-mcmcgl , http://glmm.wikidot.com/faq Here is my working understanding of the process: (1) Generate new data / prediction frame -> pred_frame (2) Generate fixed effects design matrix -> fix_dm (3) Extract chains of coefficients from Sol -> beta (4) Matrix-multiply fix_dm * beta -> temp (5) Extract variance-covariance matrix as rowsums of mod$VCV -> V (6) Divide temp by sqrt(V) to get a matrix of predictions [i, j] with i = MCMC sample and j= new data point -> pred_mat (7) Generate CIs for predictions as HPDs of columns of pred_mat Here are my questions: a. Most broadly - am I on the right track, or have I totally misunderstood something critical? b. For (4) above - the structure of model$Sol in the multi-response model is complex and doesn't conform to the design matrix. I figure I need to melt the design matrix but I'm unsure how to ensure I'm doing that in just the same way as MCMCglmm. c. Some of the same unanswered questions discussed here: https://stat.ethz.ch/pipermail/r-sig-mixed-models/2015q2/023616.html , - I suspect based on some of the discussions linked to above that the procedure I outlined produces results that are marginalized over random effects, but I'm not at all sure. - What's the most appropriate way to generate CIs for predictions? If I'm not wrong about (7), what's the best way to calculate those HPDs? Here is the model I'm working with: prior2 <- list(R=list(V=diag(3),nu=3), G=list(G1=list(V=diag(3), nu=3.02, alpha.mu=rep(0,3), alpha.V=1000*diag(3)), G2=list(V=diag(3), nu=3.02, alpha.mu=rep(0,3), alpha.V=1000*diag(3) ) ) ) m2b <- MCMCglmm(cbind(professional, relational, practice) ~ -1 + trait + trait:income + trait:age + trait:ethnicity + trait:gender + trait:residence + trait:education + trait:HDI + trait:Ineq + trait:Enviro + trait:Enviro:gender + trait:Ineq:gender + trait:Ineq:ethnicity + trait:Ineq:income, random= ~us(trait):block + us(trait):ID, rcov= ~us(trait):units, family=rep("gaussian",3), prior=prior2, nitt <- 80000, thin <- 25, burnin <- 50000, data=df_sel, verbose=TRUE) Thank you immensely much for any help! Warmly, Rafter Rafter Sass Ferguson, MS PhD Candidate | Crop Sciences Department University of Illinois in Urbana-Champaign liberationecology.org 518 567 7407
MCMCglmm predictions with new data for multi-response model
1 message · rafter sass ferguson