Skip to content

Tukey HSD

3 messages · jlpugh, Peter Dalgaard, ilai

#
Hey Folks: 

New to R and learning as I go, but really, I know just enough to get myself
into trouble. I've waded through everything up till now, and don't see
anything in the search that is directly helpful for the POS that I've
created.

The GOAL: All I want in the world is a program that performs 1-way ANOVA's
on every column in a data set (taking the first column as the definition of
the groups) and then spits out ONLY those results that were significant (p
<= whatever I want), with their respective plots and TukeyHSD results.
Sounds simple, right?

Data:

http://r.789695.n4.nabble.com/file/n4374072/test_text.txt test_text.txt 

(see upload)

CODE:  ( which occurs after TEST <- read.table("test_text.txt") )

i <- 2;
sink (file = "test_output.txt", append = FALSE)
mydf <- data.frame(TEST)
for (j in 1:ncol(mydf)-1) {
	var1 <- mydf[,1]
	var2 <- mydf[,i]
	fm1 <- aov(var1 ~ var2)
	tky <- TukeyHSD(fm1)
	otpt <- capture.output(summary(fm1))
	i <- i+1;
	lines <- as.vector(unlist(strsplit(otpt[2]," ")),mode="list") # gets the
p-value
	if (grepl("[1234567890]",lines[14],perl = TRUE)) { #make sure that slot for
p-value has a number
		number <- as.numeric(lines[14]) # make it numeric for logic test
		if (number <= 0.1) {
			cat(otpt, sep = "\n\n")
			cat(tky, sep = "\n\n")
			quartz(boxplot(mydf[,i] ~ mydf[,1]))
		}
	}	
}				
sink()

The ERROR: 

Error in TukeyHSD.aov(fm1) : no factors in the fitted model
In addition: Warning message:
In replications(paste("~", xx), data = mf) : non-factors ignored: var2

It works if I leave out " tky <- TukeyHSD(fm1) " and the subsequent cat.
I've tried doing the Tukey on the mydf[,1].. itself, changing their classes,
etc. Perhaps the whole approach is flawed. 

THANKS!


--
View this message in context: http://r.789695.n4.nabble.com/Tukey-HSD-tp4374072p4374072.html
Sent from the R help mailing list archive at Nabble.com.
#
On Feb 9, 2012, at 20:28 , jlpugh wrote:

            
Quite likely, this could all be done more elegantly (lapply springs to mind), and there could be something in there that won't work at all, but to point out the glaring obvious: If the grouping factor is in column 1, then shouldn't you have var2~var1 in the aov() call?
and, by the way, you _really_ do not want to go via capture.output to get at the p value. It is accessible from the summary(fm1) directly. For aov objects, it's like:

summary(fm1)[[1]]$"Pr(>F)"[1]

(for "lm" objects, try anova(fm)$"Pr(>F)"[1])

  
    
#
Also you want " for(i in 2:ncol(mydf)) { ..."
Your current setup of 1:ncol(mydf)-1 picks columns 0,1,2,...,ncol-1
(missing the last), which is not what you want...
Setting i<-2 in the first line was overridden by the call to loop.

Cheers
On Thu, Feb 9, 2012 at 2:33 PM, peter dalgaard <pdalgd at gmail.com> wrote: