Hi folks,

Visual data analysis is very important in my field (cognitive science). I know you can obtain confidence intervals for the cells of a fixed-effects design (as described at http://glmm.wikidot.com/faq ), and confidence intervals for each effect/interaction via MCMC, but these approaches (at least as I understand them) don't completely satisfy me. I can provide further details on my (possibly naive) dissatisfaction if necessary, but for now I'd be grateful for feedback on a solution I've come up with that lets me visualize any level of the data I choose.

The approach I take is to obtain the model predictions for each cell of the fixed-effects design, then bootstrap distributions of predictions for each cell. The data I typically encounter have only one random effect (experiment participant) and many observations within each participant and cell of the fixed-effects design, so on each iteration of the bootstrap I resample participants, then resample observations independently within each individual in the new sample of participants (as recommended at http://stats.stackexchange.com/questions/1399/obtaining-and-interpreting-bootstrapped-confidence-intervals-from-hierarchical-da ).

This yields distributions of predicted values for each cell of the fixed-effects design, which can be used to generate CIs for each cell, but also to compute the CI for any effect/interaction. For example, if I suspect (whether a priori or from the t-values of the original model) that there's an interaction between the 2-level and 4-level predictors, I can generate 2 useful graphs:

(1) collapse the 3-level predictor to a mean within each iteration and plot the resulting set of 8 means and associated CIs.
(2) collapse the 3-level predictor to a mean within each iteration, *then* collapse the 2-level predictor to a difference score within each iteration, and plot the resulting set of 4 means and associated CIs.

If I have a numeric predictor in the model, I would obtain predictions across a set of values spanning the range of this predictor (e.g. seq( min(nIV) , max(nIV) , length.out = 1e3 ) ).

One thing I like about this approach is that within each predictor I don't have to specify an intercept level against which the other levels are compared. Furthermore, since I typically deal with data that are strongly positively skewed (human response times), I wonder whether the non-parametric nature of bootstrapping actually improves inference relative to simply looking at the t-values from the original model or anova()-based sub-model comparisons, both of which assume Gaussian error as I understand it.

I'd appreciate feedback on the reasonableness of this approach.

Cheers,

Mike
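P.S. In case a concrete sketch helps, here's roughly what the procedure looks like in R. The model-refitting step is replaced by a stand-in (raw cell means), and all object names (d, pid, f2, f4, predict_cells) are invented; the point is just the two-stage resampling and the percentile CIs:

```r
## Toy data: 10 participants, a 2x4 fixed design, 20 obs per cell,
## lognormal "response times". Everything here is made up for illustration.
set.seed(1)
d <- expand.grid(pid = 1:10, f2 = c("A", "B"), f4 = 1:4, rep = 1:20)
d$rt <- exp(rnorm(nrow(d), mean = 6, sd = 0.4))

## Stand-in for "refit the model and extract cell predictions";
## here just the raw cell means, returned as a named vector.
predict_cells <- function(d) {
  c(tapply(d$rt, interaction(d$f2, d$f4), mean))
}

## One bootstrap iteration: resample participants, then resample
## observations within each sampled participant.
one_iteration <- function(d) {
  pids <- sample(unique(d$pid), replace = TRUE)
  new_d <- do.call(rbind, lapply(seq_along(pids), function(i) {
    obs <- d[d$pid == pids[i], , drop = FALSE]
    obs <- obs[sample(nrow(obs), replace = TRUE), , drop = FALSE]
    obs$pid <- i  # relabel so a participant drawn twice counts as two
    obs
  }))
  predict_cells(new_d)
}

boot_preds <- t(replicate(1000, one_iteration(d)))  # iterations x cells
cell_ci <- apply(boot_preds, 2, quantile, probs = c(0.025, 0.975))

## The same matrix gives CIs for any collapsed effect, e.g. the A-B
## difference score within each level of the 4-level predictor:
ab_diff <- boot_preds[, grep("^A", colnames(boot_preds))] -
           boot_preds[, grep("^B", colnames(boot_preds))]
diff_ci <- apply(ab_diff, 2, quantile, probs = c(0.025, 0.975))
```

In the real version, predict_cells() would refit the mixed model to the resampled data and return its predictions for each cell, but everything downstream of it stays the same.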
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University

Looking to arrange a meeting? Check my public calendar:
http://tr.im/mikes_public_calendar

~ Certainty is folly... I think. ~