split() is slow on data.frame (PR#14123)

1 message · Peng Yu

I made a version of split() for matrices, because it would be more
efficient to split each column of a matrix than to convert the matrix
to a data.frame and then call split() on the data.frame. Note that the
matrix version differs slightly from the data.frame version. Could
somebody add this to R as well?

split.matrix<-function(x,f) {
  #print('processing matrix')
  # split each column separately: v[[j]] holds the per-group pieces of column j
  v=lapply(
      seq_len(ncol(x))
      , function(i) {
        base:::split.default(x[,i],f)#the difference is here
      }
      )

  # for each group, bind the per-column pieces back together into a matrix
  w=lapply(
      seq_along(v[[1]])
      , function(i) {
        result=do.call(
            cbind
            , lapply(v,
                function(vj) {
                  vj[[i]]
                }
                )
            )
        colnames(result)=colnames(x)
        return(result)
      }
      )
  names(w)=names(v[[1]])
  return(w)
}
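
To make the intended result concrete, here is a small sketch of what splitting a matrix by a grouping vector should return. The matrix m and the grouping vector g are made up for illustration. The second approach subsets rows by index, which is not the column-wise method used in split.matrix above; it is used here only so that the example runs on its own in base R.

```r
# Hypothetical example data: a 4 x 3 integer matrix and a grouping vector
m <- matrix(1:12, nrow = 4, dimnames = list(NULL, c("a", "b", "c")))
g <- c("x", "y", "x", "y")

# Route 1: convert to a data.frame, split, and convert each piece back
via_df <- lapply(split(as.data.frame(m), g), as.matrix)

# Route 2: split the row indices once, then subset the matrix directly
idx <- split(seq_len(nrow(m)), g)
via_idx <- lapply(idx, function(i) m[i, , drop = FALSE])

# Both routes give a named list with one matrix per group
str(via_idx)
```

Either way the result is a list named by group, each element a matrix with the original column names, which is the same shape that split.matrix is meant to produce.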
On Wed, Dec 9, 2009 at 5:44 PM, Charles C. Berry <cberry at tajo.ucsd.edu> wrote: