vectorizing ANOVA over a vectorized linear model
Hi Mark,

Unless you are fitting millions of very, very simple models, I doubt that extracting p-values is going to be the limiting factor in the speed of your analysis.

Hadley
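For concreteness, the one-gene-at-a-time route under discussion might look like the sketch below. The simulated matrix `expr`, the factors `A` and `B`, and the helper `gene_pvalues` are illustrative assumptions, not taken from Mark's data or code.

```r
# Simulated stand-in for the real data (assumption): genes in rows, samples in columns
set.seed(42)
design <- expand.grid(rep = 1:3,
                      A = factor(c("a1", "a2")),
                      B = factor(c("b1", "b2", "b3")))
expr <- matrix(rnorm(50 * nrow(design)), nrow = 50,
               dimnames = list(paste0("gene", 1:50), NULL))

# Hypothetical helper: fit one gene, return the p-value of each model term
gene_pvalues <- function(y, design) {
  tab <- anova(lm(y ~ A * B, data = design))
  p <- tab[["Pr(>F)"]]
  names(p) <- rownames(tab)
  p[rownames(tab) != "Residuals"]   # the Residuals row carries no p-value
}

# Loop over the rows (genes); the result is a genes x terms matrix
pvals_loop <- t(apply(expr, 1, gene_pvalues, design = design))
head(pvals_loop)
```

Wrapping the two steps in `system.time()` separately (the `lm()` fit versus the `anova()` call) is one way to see where the time actually goes on a given data set. On platforms that support forking, the `apply()` call could presumably be swapped for `parallel::mclapply()` over the row indices.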
On Mon, Mar 8, 2010 at 3:47 AM, Mark Kimpel <mwkimpel at gmail.com> wrote:
Hadley,

Thanks for pointing me to some good articles. Unfortunately, I have already read Holger's, and my main concern is computational efficiency. The buzzword on this list regarding efficient code is "vectorization". I am, frankly, surprised that there is a way to vectorize the analysis of complex models but not the extraction of p-values from them. Dieter's reply points towards using lapply, which in my experience allows for compact code but not an increase in efficiency (one of Holger's examples demonstrates this). Anyway, I cannot see how to go from Holger's fairly simple examples to one that involves a complex model with several factors and interactions.

Limma, which does provide p-values if contrasts are used, is blindingly fast, but I believe Gordon Smyth has hard-coded most of this excellent package in C. I was hoping to achieve something similar without the moderated t-statistics that Limma uses.

Looks like I am stuck using loops with mclapply. Thank goodness for my Core i7!

Mark

Mark W. Kimpel MD
** Neuroinformatics **
Dept. of Psychiatry
Indiana University School of Medicine
15032 Hunter Court, Westfield, IN 46074
(317) 490-5129 Work, & Mobile & VoiceMail
(317) 399-1219 Skype No Voicemail please

On Sun, Mar 7, 2010 at 2:08 PM, hadley wickham <h.wickham at gmail.com> wrote:
Hi Mark,

If efficiency is a concern you might want to read "Computing Thousands of Test Statistics Simultaneously in R" by Holger Schwender and Tina Müller, http://stat-computing.org/newsletter/issues/scgn-18-1.pdf.

If you just want to do it, see the examples in http://had.co.nz/plyr/plyr-intro-090510.pdf.

Hadley

On Sun, Mar 7, 2010 at 7:03 PM, Mark Kimpel <mwkimpel at gmail.com> wrote:
Is it possible to vectorize anova over the output of a vectorized lm? I have a gene expression matrix with each row being a gene and columns for samples. There are several factors with interactions. I can get p-values by looping over the matrix with lm and anova, but I would like to make this as computationally efficient as possible. I am able to vectorize the lm command, but when I try to use anova on the resulting model object I get just one anova result. Is what I want to do possible?

And, yes, I am quite conversant with Limma and other BioC packages; I have my reasons for wanting to use lm and anova.

Thanks,
Mark
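One possible way to get per-gene p-values from a single matrix-response fit is sketched below. `anova()` appears to treat such a fit as one multivariate model, which is presumably why only a single table comes back. The sketch instead applies the bookkeeping that `anova()` performs for an ordinary single-response `lm` (sequential sums of squares taken from the squared QR effects) to all response columns at once. The simulated data, the factor names `A` and `B`, and all variable names are assumptions made for illustration; the F tests are the usual sequential (Type I) ones, not moderated statistics.

```r
# Simulated stand-in for the real data (assumption): genes in rows, samples in columns
set.seed(1)
design <- expand.grid(rep = 1:3,
                      A = factor(c("a1", "a2")),
                      B = factor(c("b1", "b2", "b3")))
n_genes <- 200
expr <- matrix(rnorm(n_genes * nrow(design)), nrow = n_genes,
               dimnames = list(paste0("gene", seq_len(n_genes)), NULL))

# One lm() call fits every gene: the response is a samples x genes matrix
fit <- lm(t(expr) ~ A * B, data = design)

# Per-term sums of squares from the squared orthogonal effects, all genes at once
rk     <- fit$rank
asgn   <- fit$assign[fit$qr$pivot][seq_len(rk)]  # term index of each estimable column
eff_sq <- fit$effects^2                          # samples x genes
ss     <- rowsum(eff_sq[seq_len(rk), , drop = FALSE], asgn)  # terms x genes
df     <- as.vector(table(asgn))                 # degrees of freedom per term
rss    <- colSums(eff_sq[-seq_len(rk), , drop = FALSE])      # residual SS per gene
df_res <- fit$df.residual

# Drop the intercept (term index 0), then form F statistics and p-values
keep  <- rownames(ss) != "0"
ms    <- ss[keep, , drop = FALSE] / df[keep]
fstat <- sweep(ms, 2, rss / df_res, "/")
pvals <- t(matrix(pf(fstat, df[keep], df_res, lower.tail = FALSE),
                  nrow = nrow(fstat)))
term_labels <- attr(fit$terms, "term.labels")
dimnames(pvals) <- list(rownames(expr),
                        term_labels[as.integer(rownames(fstat))])

# Spot check against the ordinary one-gene-at-a-time route
ref <- anova(lm(expr[1, ] ~ A * B, data = design))[["Pr(>F)"]][1:3]
stopifnot(isTRUE(all.equal(unname(pvals[1, ]), ref)))
```

The result `pvals` is a genes x terms matrix. Because the sums of squares are sequential, the p-values depend on the order of terms in the formula when the design is unbalanced, exactly as they do with `anova()` on a single fit.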
-- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/