using functions with multiple arguments in the "apply" family
chipmaney wrote:
Typically, the apply family wants you to use vectors to run functions on. However, I have a function, kruskal.test, that requires 2 arguments:

    kruskal.test(Herb.df$Score,Herb.df$Year)

This easily computes the KW ANOVA statistic for any difference across years. However, my data has multiple sites on which KW needs to be run. Here's the data:

    Herb.df<- data.frame(Score=rep(c(2,4,6,6,6,5,7,8,6,9),2),Year=rep(c(rep(1,5),rep(2,5)),2),Site=c(rep(3,10),rep(4,10)))

However, if I try this:

    tapply(Herb.df,Herb.df$Site,function(.data) kruskal.test(.data$Indicator_Rating,.data$Year))

    Error in tapply(Herb.df, Herb.df$ID, function(.data) kruskal.test(.data$Indicator_Rating, : arguments must have same length

How can I vectorize the kruskal.test() for all sites using tapply() in lieu of a loop?
Peter Ehlers wrote:
Your example data makes little sense; you have precisely the same data for both sites and you have only two sites (why do kruskal.test on two sites?). Finally, you need to decide what your response variable is: 'Score' or 'Indicator_Rating'.

So here's some made-up data and the use of by() to apply the test to each site:

    dat <- data.frame(y = rnorm(60), yr=gl(4,5,60), st=gl(3,20))
    with(dat, by(dat, st, function(x) kruskal.test(y~yr, data=x)))

See the last example in ?by.

-Peter Ehlers
University of Calgary
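
For anyone landing on this thread later, here is a minimal sketch of the same idea using split() and lapply() instead of by(), run on the data frame from the question. It assumes 'Score' is the intended response (the question's tapply() call used 'Indicator_Rating', which is not a column of Herb.df); the names by_site, kw and p_values are made up for illustration. It also shows the likely reason for the "arguments must have same length" message: a data frame is a list of columns, so its length is 3, while the Site vector has 20 elements.

```r
# Data from the question, with 'Score' assumed to be the response variable.
Herb.df <- data.frame(
  Score = rep(c(2, 4, 6, 6, 6, 5, 7, 8, 6, 9), 2),
  Year  = rep(c(rep(1, 5), rep(2, 5)), 2),
  Site  = c(rep(3, 10), rep(4, 10))
)

# A data frame is a list of columns: its length is the number of columns,
# not the number of rows, so it does not match the length of the index.
length(Herb.df)        # 3
length(Herb.df$Site)   # 20

# Split into one data frame per site, then run the test on each piece.
by_site <- split(Herb.df, Herb.df$Site)
kw <- lapply(by_site, function(d) kruskal.test(Score ~ Year, data = d))

# Collect the p-values into a numeric vector named by site.
p_values <- sapply(kw, function(res) res$p.value)
p_values
```

The result kw is a plain list of "htest" objects, so individual tests can be inspected with kw[["3"]] or kw[["4"]]. Since the example data is identical for both sites, the two p-values will be the same.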