
aov or t-test applied on all variables of a data.frame

5 messages · Thomas Lumley, Dimitris Rizopoulos, Peter Dalgaard +1 more

#
Hi
I have a data.frame with, say, 10 continuous variables and one grouping
factor (say 3 levels).

How can I easily (without loops) apply to each continuous variable e.g.
an aov, with the grouping factor as my factor (or, if the grouping factor
has 2 levels, e.g. a t-test)?

Thanks for a hint

cheers

christoph
#
On Fri, 11 Mar 2005, Christoph Lehmann wrote:

            
You can call aov() or lm() with a multicolumn response variable.
Response y1 :
            Df Sum Sq Mean Sq F value Pr(>F)
factor(x)    2  0.187   0.093  0.0735 0.9293
Residuals   27 34.326   1.271

Response y2 :
            Df Sum Sq Mean Sq F value Pr(>F)
factor(x)    2  0.133   0.066  0.0497 0.9516
Residuals   27 36.107   1.337

Response y3 :
            Df Sum Sq Mean Sq F value  Pr(>F)
factor(x)    2  6.051   3.026  2.5605 0.09589 .
Residuals   27 31.903   1.182
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
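
The call that produced this output is not preserved in the archive. A minimal sketch of the multicolumn-response approach, assuming a hypothetical three-level grouping vector `x` and a 30 x 3 response matrix `y` with columns `y1`..`y3` (simulated data, so the numbers will differ from those above), might look like this:

```r
set.seed(1)                         # arbitrary seed, only so the sketch is reproducible
x <- rep(1:3, each = 10)            # hypothetical grouping variable with 3 levels
y <- matrix(rnorm(30 * 3), 30, 3,   # hypothetical responses, one column per variable
            dimnames = list(NULL, c("y1", "y2", "y3")))

fit <- aov(y ~ factor(x))           # matrix response: one ANOVA is fitted per column
summary(fit)                        # prints one "Response ..." table per column
```

For a data.frame, the numeric columns would first have to be turned into a matrix, e.g. with `as.matrix()`, before being used on the left-hand side.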


 	-thomas
#
you mean something like this:

dat <- data.frame(matrix(rnorm(10 * 100), 100),
                  f = sample(letters[1:3], 100, TRUE))
# fit one aov per numeric column, with dat$f as the grouping factor
models <- lapply(dat[sapply(dat, is.numeric)],
                 function(x, f) aov(x ~ f), f = dat$f)
#################
models
lapply(models, summary)

I hope it helps.

Best,
Dimitris

----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/16/336899
Fax: +32/16/337015
Web: http://www.med.kuleuven.ac.be/biostat/
     http://www.student.kuleuven.ac.be/~m0390867/dimitris.htm


----- Original Message ----- 
From: "Christoph Lehmann" <christoph.lehmann at gmx.ch>
To: <r-help at stat.math.ethz.ch>
Sent: Friday, March 11, 2005 4:21 PM
Subject: [R] aov or t-test applied on all variables of a data.frame
#
Christoph Lehmann <christoph.lehmann at gmx.ch> writes:

Generally something with lapply or sapply, e.g.

lapply(dd[-1], function(y) t.test(y~dd$V1))

$V2

        Welch Two Sample t-test

data:  y by dd$V1
t = 1.5465, df = 39.396, p-value = 0.13
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.02500802  0.18764439
sample estimates:
mean in group 1 mean in group 2
       1.096818        1.015500

...etc, one for each of V2..V8

or, in a more compact form 

sapply(dd[-1], function(y) t.test(y~dd$V1))[1:3,]

          V2        V3        V4         V5         V6        V7
statistic 1.546456  1.008719  0.08158578 -0.2456436 -0.872376 -1.405966
parameter 39.39554  36.30778  39.70288   36.99061   36.99944  35.97947
p.value   0.1299909 0.3197851 0.935386   0.807316   0.3886296 0.1683118
          V8
statistic -0.6724112
parameter 29.65156
p.value   0.5065284

or (this'll get the confidence intervals and estimates printed sensibly).

sapply(dd[-1], function(y)unlist(t.test(y~dd$V1)[1:5]))
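
The data frame `dd` is not shown in the thread. A self-contained sketch of the last call, assuming a hypothetical `dd` whose first column `V1` is a two-level grouping factor and whose remaining columns `V2`..`V8` are continuous (simulated here), could be:

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible
# hypothetical stand-in for dd: V1 is the grouping factor, V2..V8 are measurements
dd <- data.frame(V1 = factor(rep(1:2, each = 20)),
                 matrix(rnorm(40 * 7), 40, 7,
                        dimnames = list(NULL, paste0("V", 2:8))))

# one Welch t-test per column; elements 1:5 of the result are the statistic,
# df, p-value, confidence interval and group means, flattened by unlist()
res <- sapply(dd[-1], function(y) unlist(t.test(y ~ dd$V1)[1:5]))
round(res, 3)
```

Because every test returns a numeric vector of the same length, sapply() simplifies the result to a numeric matrix with one column per variable.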
#
Many thanks for the sapply hint. How can I use sapply to get a similarly
compact result from the aov computation? Say I call

sapply(dd[-1], function(y, f) aov(y ~ f), f = dd$V1)

aov returns its result in a different form than t.test.

thanks a lot
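
One possible way to get a compact table, sketched here under assumptions and not taken from any reply in the thread: summary() of a single-response aov fit is a list whose first element is the ANOVA table, so its first row (the factor term) can be flattened with unlist(). The data frame `dd` and the helper `aov_row` below are hypothetical.

```r
set.seed(7)  # arbitrary seed, only so the sketch is reproducible
# hypothetical stand-in for dd: V1 is a three-level grouping factor
dd <- data.frame(V1 = factor(rep(letters[1:3], each = 20)),
                 matrix(rnorm(60 * 4), 60, 4,
                        dimnames = list(NULL, paste0("V", 2:5))))

# hypothetical helper: first row of the ANOVA table (the factor term) as a vector
aov_row <- function(y, f) unlist(summary(aov(y ~ f))[[1]][1, ])

res <- sapply(dd[-1], aov_row, f = dd$V1)
res   # rows: Df, Sum Sq, Mean Sq, F value, Pr(>F); one column per variable
```

The grouping column should be a factor here; a numeric code such as 1/2/3 would be fitted as a single-df regression term.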
Peter Dalgaard wrote: