Hi folks,

Here are code and files for the analysis.

Hi folks,

Repost/summary of previous correspondence. Thanks to Ben Bolker, Spencer Graves, and Ken Beath for questions and comments.

I have doubts about the MCMC results with the most recent version of lme4 (versions ...-20 on CRAN and -21 on R-Forge). I recently tried analyzing a moderately sized (n ~ 500), unbalanced data set for which lme and lmer gave similar F ratios. lmer gave the same F-tests (via simulation) as lme, but the MCMC output (using Helmert and sum contrasts) depended wildly on which terms were retained in the model. Unfortunately, the problem seems to depend on the data, and I cannot make the data widely available, although I could send it to individuals.

My original question, and a compilation of other emails:

Are there general situations in which we might expect very different answers from F-tests vs. mcmcpvalue with orthogonal contrasts (Helmert)? I am helping someone with a normal linear model on a moderately sized, noisy data set, and I am getting very different probabilities between F-tests and mcmcpvalue for some interactions. (Output appended.) I get similar F-test results whether I use lm (ignoring the random effect of subject), lme, or lmer with a denominator-degrees-of-freedom approximation, and simulations of the null hypothesis (via lmer's simulate method) indicate that F statistics as large as (or larger than) my observed F statistics are VERY unlikely under the null. When I use mcmcpvalue, I get huge changes in the P-value of a main effect (0.6 to 0.01) when I remove its interactions. In contrast, the F-tests (using trace-of-the-hat-matrix DFs) are much more consistent when I change the fixed-effect structure. Is mcmcpvalue much more sensitive to overfitting the model? In some cases, removing the interactions results in a lower AIC (with ML fits).
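For readers following along, the mcmcpvalue function referred to throughout this thread is, as best I recall from Doug Bates' earlier list postings, essentially the following (treat this as a sketch of that function, not a canonical copy):

```r
## Empirical P-value for a (possibly multi-column) block of fixed-effect
## coefficients, from an MCMC sample 'samp' (iterations in rows).
## It measures how far the origin (all coefficients = 0) lies in the
## posterior sample, via squared Mahalanobis-type distances.
mcmcpvalue <- function(samp) {
    std <- backsolve(chol(var(samp)),
                     cbind(0, t(samp)) - colMeans(samp),
                     transpose = TRUE)
    sqdist <- colSums(std * std)
    sum(sqdist[-1] > sqdist[1]) / nrow(samp)
}
```

One would call it on the columns of the mcmcsamp output corresponding to the term of interest, e.g. `mcmcpvalue(as.matrix(samp[, cols]))`.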
The MCMC sample traces look (in my limited experience) free of peculiarities, and the density plots are all quite symmetrical and normal-ish. In the full model we have 28 fixed coefficients (22 continuous variables or slope interactions) and about 500 observations on about 200 subjects (2-3 observations per subject). The data are VERY unbalanced. The QQ plots look normal but highlight the lack of balance (from one to dozens of reps per treatment combination).

OUTPUT:

The following tables are output for factors/covariates from a function I wrote to use with lmer. "P.sum" is an F-test using one hack for denominator degrees of freedom (these agree qualitatively with lme). "P(H|D)" is the empirical P-value generated by Bates' function mcmcpvalue, with my hack for the new structure of the latest mcmcsamp output. (NOTE that I checked that these P-values also agree with direct HPDinterval and quantile assessments of individual coefficients.)

FULL MODEL
aov.Bayes(modcA, n=1000)$aov.tab
                     Df  Sum.Sq Mean.Sq F.value  P.sum P(H|D)
ageinj                1  127.00  127.00  2.6559 0.1038  0.096
sex                   1  233.32  233.32  4.8793 0.0277  0.008
race                  1  216.49  216.49  4.5273 0.0339  0.372
zses                  1  249.03  249.03  5.2078 0.0229  0.010
tsi                   1  138.05  138.05  2.8870 0.0900  0.005
tsq                   1  393.21  393.21  8.2230 0.0043  0.008
grp                   3  324.45  108.15  2.2617 0.0805  0.648
zfad                  1  988.80  988.80 20.6782 0.0000  0.051
tsi:grp               3  460.46  153.49  3.2098 0.0229  0.364
tsq:grp               3  229.16   76.39  1.5974 0.1892  0.541
grp:zfad              3   89.48   29.83  0.6238 0.5999  0.555
tsi:zfad              1   84.01   84.01  1.7568 0.1857  0.963
tsq:zfad              1   78.01   78.01  1.6314 0.2021  0.633
tsi:grp:zfad          3 1026.88  342.29  7.1582 0.0001  0.812
tsq:grp:zfad          3  123.36   41.12  0.8599 0.4618  0.884
Residuals (Tr(hat)) 469      NA   47.82      NA     NA     NA

FULL MODEL minus one three-way interaction: this changes P(H|D) for the other 3-way.
aov.Bayes(modcB, n=1000)$aov.tab
                     Df  Sum.Sq Mean.Sq F.value  P.sum P(H|D)
ageinj                1  127.85  127.85   2.670 0.1029  0.094
sex                   1  234.75  234.75   4.902 0.0273  0.008
race                  1  217.92  217.92   4.551 0.0334  0.357
zses                  1  250.62  250.62   5.234 0.0226  0.017
tsi                   1  138.01  138.01   2.882 0.0902  0.007
tsq                   1  393.04  393.04   8.208 0.0044  0.010
grp                   3  326.62  108.87   2.274 0.0793  0.646
zfad                  1  994.61  994.61  20.771 0.0000  0.024
tsi:grp               3  460.45  153.48   3.205 0.0230  0.334
tsq:grp               3  229.34   76.45   1.596 0.1894  0.453
grp:zfad              3   90.07   30.02   0.627 0.5978  0.292
tsi:zfad              1   83.88   83.88   1.752 0.1863  0.777
tsq:zfad              1   77.92   77.92   1.627 0.2027  0.358
tsi:grp:zfad          3 1026.92  342.31   7.148 0.0001  0.015
Residuals (Tr(hat)) 472      NA   47.89      NA     NA     NA
NO INTERACTIONS
aov.Bayes(modcD, n=1000)$aov.tab
                     Df  Sum.Sq Mean.Sq F.value  P.sum P(H|D)
ageinj                1   96.66   96.66   1.847 0.1748  0.061
sex                   1  269.07  269.07   5.141 0.0238  0.005
race                  1  214.66  214.66   4.102 0.0434  0.064
zses                  1  257.90  257.90   4.928 0.0269  0.030
tsi                   1  137.91  137.91   2.635 0.1052  0.044
tsq                   1  415.80  415.80   7.945 0.0050  0.069
grp                   3  373.93  124.64   2.382 0.0688  0.001
Residuals (Tr(hat)) 490      NA   52.34      NA     NA     NA
sessionInfo()
R version 2.7.1 (2008-06-23)
i386-apple-darwin8.10.1

locale:
C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods
[7] base

other attached packages:
[1] foreign_0.8-26     lme4_0.999375-21   Matrix_0.999375-10
[4] lattice_0.17-8     Hmisc_3.4-3

loaded via a namespace (and not attached):
[1] cluster_1.11.11 grid_2.7.1
Are there general situations in which we might expect very different answers from F-tests vs. mcmcpvalue with orthogonal contrasts (Helmert)? I am helping someone with a normal linear model on a moderately sized (n = 500), noisy data set, and I am getting very different probabilities between F-tests and mcmcpvalue for some interactions. I get similar F-test results whether I use lm (ignoring the random effect of subject), lme, or lmer with a denominator-degrees-of-freedom approximation. When I use mcmcpvalue, I get huge changes in the P-value of a main effect (0.6 to 0.01) when I remove its interactions. In contrast, the F-tests (using trace-of-the-hat-matrix DFs) are much more consistent when I change the fixed-effect structure. I think mcmcpvalue is much more sensitive to overfitting the model. In some cases, removing the interactions results in a lower AIC (with ML fits). In the full model, we have 28 fixed coefficients (22 continuous variables or slope interactions) and about 500 observations. The data are VERY unbalanced.
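For concreteness, the simulation F-test mentioned above amounts to a parametric bootstrap under the null model. A minimal sketch follows; note that `getFstat` is a placeholder for however one extracts the F statistic of interest from a fit (it is not a real lme4 function), and the response substitution is done by hand since the lme4 of this vintage had no refit():

```r
## Parametric bootstrap P-value for an F statistic.
## fit0  : lmer fit of the null (reduced) model
## form1 : formula of the full model
## Fobs  : observed F statistic from the full-model fit
pboot.F <- function(fit0, form1, data, yname, Fobs, nsim = 200) {
    Fsim <- numeric(nsim)
    for (i in seq_len(nsim)) {
        data[[yname]] <- simulate(fit0)[[1]]  # response drawn under H0
        fit1.sim <- lmer(form1, data = data)  # refit full model
        Fsim[i] <- getFstat(fit1.sim)         # hypothetical extractor
    }
    mean(Fsim >= Fobs)                        # bootstrap P-value
}
```

The P-value is simply the proportion of null simulations whose F statistic meets or exceeds the observed one.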
On Jul 10, 2008, at 9:29 AM, Gillian Raab wrote:
MCMC confidence intervals (strictly, Bayesian credible intervals) give much better results in some circumstances, notably for parameters that are poorly estimated. In my experience, things are seldom a problem for fixed effects unless the model is ill-specified, but the MCMC methods really score for random-effect parameters.

2008/7/10 Doran, Harold <HDoran at air.org>:
Out of curiosity, why not just use the asymptotic standard errors of the fixed effects to get CIs rather than via simulation?
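For reference, the asymptotic (Wald) intervals Harold describes can be sketched as follows with the lme4 of the time; the as.matrix coercion is a defensive assumption about vcov returning a structured Matrix-class object, and sleepstudy is just the example data set shipped with lme4:

```r
library(lme4)

## Example fit on the lme4 demo data
fm <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

fe <- fixef(fm)                          # fixed-effect estimates
se <- sqrt(diag(as.matrix(vcov(fm))))    # asymptotic standard errors
cbind(estimate = fe,
      lower = fe - 1.96 * se,            # approximate 95% Wald CI
      upper = fe + 1.96 * se)
```

These intervals rely on the normal approximation and ignore uncertainty in the variance components, which is part of what the MCMC approach is meant to address.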
-----Original Message-----
From: r-sig-mixed-models-bounces at r-project.org [mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of Jonathan Baron
Sent: Thursday, July 10, 2008 8:05 AM
To: Michael Kubovy
Cc: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] State of lme4

On 07/10/08 07:51, Michael Kubovy wrote:
Dear Friends, I have become confused as to which set-up of lme4, arm, gmodels, etc. will produce CIs on the fixed effects by simulation.
In the latest version of lme4, mcmcsamp and HPDinterval work. And languageR has a new version as of yesterday, which seems to deal correctly with the current version of lme4. Specifically, pvals.fnc() works.

I don't know about arm and gmodels. I haven't seen new versions, so I suspect they will not deal with the new format of lmer.

Jon

--
Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~baron
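For anyone trying this, the basic incantation with the lme4 version discussed in this thread (0.999375-21, where mcmcsamp returns the new merMCMC-style object) was along these lines; the n = 1000 sample size is just an illustrative choice:

```r
library(lme4)

## Fit a mixed model on the demo data shipped with lme4
fm <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

samp <- mcmcsamp(fm, n = 1000)  # MCMC sample of the model parameters
HPDinterval(samp)               # 95% highest posterior density intervals
```

Note that mcmcsamp was later deprecated and removed from lme4, so this applies only to versions contemporary with this thread.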
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
--
Gillian M Raab
10 Ainslie Place EH3 6AS
tel 0131 226 6234
mobile 07748 678 551
Dr. Hank Stevens, Associate Professor
338 Pearson Hall
Botany Department
Miami University
Oxford, OH 45056
Office: (513) 529-4206
Lab: (513) 529-4262
FAX: (513) 529-4243
http://www.cas.muohio.edu/~stevenmh/
http://www.cas.muohio.edu/ecology
http://www.muohio.edu/botany/

"If the stars should appear one night in a thousand years, how would men believe and adore." -Ralph Waldo Emerson, writer and philosopher (1803-1882)