converting "by" to a data.frame?

Thanks to Thomas Lumley, Sundar Dorai-Raj, and Don MacQueen for their 
suggestions.  I need the INDICES as part of the output data.frame, which 
MacQueen's solution provided.  I generalized his method as follows:

by.to.data.frame <-
function(x, INDICES, FUN){
# Split data.frame x on x[,INDICES]
# and lapply FUN to each data.frame subset,
# returning a data.frame
#
#  Internal functions
#    get.Index:  paste the INDICES columns into one key, e.g. "A1:B1"
     get.Index <- function(x, INDICES){
         Ind <- as.character(x[,INDICES[1]])
         k <- length(INDICES)
         if(k > 1)
             Ind <- paste(Ind, get.Index(x, INDICES[-1]), sep=":")
         Ind
     }
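#    FUN2:  apply FUN to one subset and prepend that subset's INDICES values
     FUN2 <- function(data., INDICES, FUN){
         vec <- FUN(data.)
         Vec <- matrix(vec, nrow=1)
         dimnames(Vec) <- list(NULL, names(vec))
#        drop=FALSE keeps a data.frame even when INDICES names a single column
         cbind(data.[1,INDICES, drop=FALSE], Vec)
     }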
#   Combine INDICES
     Ind <- get.Index(x, INDICES)
#   Apply ...:  Do the work.
     Split <- split(x, Ind)
     byFits <- lapply(Split, FUN2, INDICES, FUN)
#   Convert to a data.frame
     do.call('rbind',byFits) 	
}
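
For readers who want to see what the combined key looks like on its own, here is a minimal sketch of the same "A1:B1"-style key built without recursion. It is not part of the function above; the small data.frame is made up for illustration, and the do.call(paste, ...) idiom is swapped in here as an alternative to get.Index, not something taken from the suggestions in this thread.

```r
# Illustrative data only (not the toy problem from the post)
x <- data.frame(A = c("A1", "A1", "A2"),
                B = c("B1", "B2", "B2"),
                stringsAsFactors = FALSE)
INDICES <- c("A", "B")

# Convert each index column to character, then paste the columns
# together elementwise with ":" between them.
key <- do.call(paste, c(lapply(x[INDICES], as.character), sep = ":"))
key
```

The resulting vector has one key per row of x, and can be passed to split() in the same way as the output of get.Index.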

Applying this to my toy problem produces the following:

 > by.df <- data.frame(A=rep(c("A1", "A2"), each=3),
+  B=rep(c("B1", "B2"), each=3), x=1:6, y=rep(0:1, length=6))
 >
 > by.to.data.frame(by.df, c("A", "B"), function(data.)coef(lm(y~x, data.)))
        A  B (Intercept)             x
A1:B1 A1 B1   0.3333333 -1.517960e-16
A2:B2 A2 B2   0.6666667  3.282015e-16
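
A detail worth keeping in mind when INDICES names only one column: single-bracket indexing of a data.frame drops a one-column selection to a plain vector unless drop=FALSE is given, which matters for the cbind step that attaches the index values. The sketch below uses a made-up data.frame and a made-up column name "est" purely for illustration.

```r
# Illustrative data only
df <- data.frame(A = c("A1", "A2"), x = 1:2)

# One column selected: the result is a bare vector (or factor),
# so the column name "A" is lost.
one.col <- df[1, "A"]

# With drop = FALSE the result is still a one-row data.frame named "A".
one.col.df <- df[1, "A", drop = FALSE]

is.data.frame(one.col)
is.data.frame(one.col.df)

# cbind on the data.frame version keeps the index column name.
names(cbind(one.col.df, est = 0.5))
```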

Thanks for the assistance.  I can now tackle the real problem that 
generated this question.

Best Wishes,
Spencer Graves
########################################
Don MacQueen wrote: