Pointer to covariates?
On Fri, 22 Feb 2002, Prof Brian Ripley wrote:
On Fri, 22 Feb 2002, Göran Broström wrote:
[...]
On Thu, 21 Feb 2002, Anne York wrote:
Here is another idea, but the overhead might be just as great.
dat <- data.frame(y = 1:3, x1 = c(1, 0, 1), x2 = c(0, 1, 0))
dat.unique <- unique(paste(as.character(dat$x1), as.character(dat$x2)))
dat.keys <- match(paste(as.character(dat$x1), as.character(dat$x2)), dat.unique)
This is very good! I made this function of it:
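cro.ay.orig <- function(dat){
    ## one row per distinct covariate combination (response is column 1)
    covar <- unique(dat[, -1])
    ## for each observation, the row of 'covar' that holds its covariates
    dat.keys <-
        match(paste(dat$x1, dat$x2, sep = ""),
              paste(covar$x1, covar$x2, sep = ""))
    ## bundle the three components in a list
    return(list(y = dat[, 1],
                covar = covar,
                keys = dat.keys))
}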
and this is fast; with 'dat' containing 100000 observations, I get:
unix.time(sor.ay.orig <- cro.ay.orig(dat[1:100000, c(1, 2, 5)]))
[1] 1.00 0.02 1.08 0.00 0.00
However, this function needs to be generalized, so I wrote:
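cro.ay <- function(dat, response = 1){
    ## one row per distinct covariate combination
    covar <- unique(dat[, -response, drop = FALSE])
    ## keys built by pasting each row separately
    dat.keys <-
        match(apply(dat[, -response, drop = FALSE], 1, paste, collapse = ""),
              apply(covar, 1, paste, collapse = ""))
    ## bundle the three components in a list
    return(list(y = dat[, response],
                covar = covar,
                keys = dat.keys))
}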
but this was much slower (though still acceptable) on the same data:
[1] 11.63 0.32 12.34 0.00 0.00
It is apparently the pasting row by row of the data frame,
apply(covar, 1, paste, collapse = "")
that takes the time. Is there a better way of doing this?
Very probably. Note that the original did not paste row-by-row. You could
use do.call. Here's an untested variant
match(do.call("paste", c(dat[, -response, drop = FALSE], sep="\001")),
do.call("paste", c(covar, sep="\001")))
Indeed! This is comparable in speed with the original. Thanks!
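A minimal sketch of how the do.call variant might be folded into the generalized function. The name cro.dc, the 'sep' argument and the toy data are assumptions made for illustration only, and the result is returned as a list so that the code runs in current R.

```r
## Hypothetical name: generalized function with column-wise pasting.
cro.dc <- function(dat, response = 1, sep = "\001") {
    x <- dat[, -response, drop = FALSE]
    covar <- unique(x)
    ## paste() is vectorized over whole columns, so each do.call() is a
    ## single call on the data frame rather than one call per row
    key.all    <- do.call("paste", c(x, sep = sep))
    key.unique <- do.call("paste", c(covar, sep = sep))
    list(y = dat[, response],
         covar = covar,
         keys = match(key.all, key.unique))
}

## Toy data (assumed): rows 1 and 3 share the same covariates
dat <- data.frame(y = 1:3, x1 = c(1, 0, 1), x2 = c(0, 1, 0))
res <- cro.dc(dat)
res$keys    # 1 2 1
```

Indexing res$covar[res$keys, ] should give back the covariate columns of the original data, which is one way to check the keys.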
Note also that I used a different separator ("\r" is also possible), as
that is much more likely to make a unique string.
It took a while, but now I understand:
paste(111, 1, sep ="") == paste(11, 11, sep = "")
[1] TRUE
'sep=""' was about the worst choice...
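The same collision can be shown on a small data frame. This is only a sketch with made-up values: two different covariate rows get identical keys with an empty separator, and distinct keys with "\001".

```r
## Made-up data: the rows (111, 1) and (11, 11) are different covariate sets
x <- data.frame(x1 = c(111, 11), x2 = c(1, 11))

key.empty <- do.call("paste", c(x, sep = ""))      # both keys are "1111"
key.ctrl  <- do.call("paste", c(x, sep = "\001"))  # keys stay distinct

anyDuplicated(key.empty) > 0    # TRUE: the two rows would be merged
anyDuplicated(key.ctrl) > 0     # FALSE
```

A control character is only a safe separator as long as it does not occur in the data itself; for numeric covariates that is not a concern.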
See duplicated.data.frame for the use of this.
Göran