Pointer to covariates?
Here is another idea, but the overhead might be just as great.

dat <- data.frame(y=1:3,x1=c(1,0,1),x2=c(0,1,0))
dat.unique <- unique(paste(as.character(dat$x1),as.character(dat$x2)))
dat.keys <- match(paste(as.character(dat$x1),as.character(dat$x2)),dat.unique)
dat
  y x1 x2
1 1  1  0
2 2  0  1
3 3  1  0
dat.unique
[1] "1 0" "0 1"
dat.keys
[1] 1 2 1
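
The same paste/match idea can be sketched for any number of covariate columns. This is only a sketch under assumptions: the helper name covariate_keys and its arguments are made up for illustration and are not from the thread, and because paste() converts numbers to character strings, covariates that differ only beyond the printed precision could be merged into one key.

```r
dat <- data.frame(y = 1:3, x1 = c(1, 0, 1), x2 = c(0, 1, 0))

# Hypothetical helper (not from the thread): returns the unique covariate
# rows together with, for each row of 'dat', the row number in that table.
covariate_keys <- function(dat, cols = 2:ncol(dat), sep = "\r") {
  x <- dat[, cols, drop = FALSE]
  # Build one string per row by pasting the covariate columns together.
  # The separator is assumed not to occur inside any covariate value.
  id <- do.call(paste, c(unname(as.list(x)), sep = sep))
  first <- !duplicated(id)            # TRUE at the first occurrence of each combination
  covar <- x[first, , drop = FALSE]   # the 'unique' covariate table
  rownames(covar) <- NULL
  keys <- match(id, id[first])        # row numbers in 'covar', one per row of 'dat'
  list(covar = covar, keys = keys)
}

res <- covariate_keys(dat)
res$covar
res$keys
```

With this, res$covar[res$keys, ] should reproduce the covariate columns of dat row by row, which is a cheap sanity check before using the keys.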
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Anne E. York
National Marine Mammal Laboratory
Seattle WA 98115-0070 USA
e-mail: anne.york at noaa.gov
Voice: +1 206-526-4039
Fax: +1 206-526-6615
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
On Thu, 21 Feb 2002, Göran Broström wrote:
On Wed, 20 Feb 2002, Gabor Grothendieck wrote:
In the first line, use the dist function, found in library mva, to get the distance between each pair of rows. From this calculate an incidence matrix for which element i,j is true if row i in dat equals row j in dat (and false elsewhere). In the second line, for each row calculate the indices of the matching rows and take the minimum of those as the key.

incid <- as.matrix(dist(dat[,-1],method="max"))==0
keys <- unlist(lapply(apply(incid,1,which),min))
Thank you very much! This is very fast, much faster than my attempts so far, but it has two drawbacks:

1. It gives pointers to first occurrences in the _original_ data frame, not the 'unique' version.
2. The first step results in a _huge_ matrix 'incid', too huge for my applications.

However, this is a promising first attempt, and I will try to refine the idea. Again, thanks!

Göran
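
The first drawback could perhaps be handled by renumbering the first-occurrence pointers with match(). The following is a minimal sketch on made-up data, not something posted in the thread. It assumes a current R, where dist() lives in the stats package, and it uses which.max() in place of the lapply/min step as a substitution of mine. It does nothing about the second drawback, since the n-by-n matrix 'incid' is still built.

```r
# Made-up example with three distinct covariate combinations
dat <- data.frame(y = 1:5, x1 = c(1, 0, 1, 0, 1), x2 = c(0, 1, 0, 0, 0))

# TRUE where row i and row j have identical covariates (distance 0)
incid <- as.matrix(dist(dat[, -1], method = "maximum")) == 0

# For each row, the index of its first matching row in the ORIGINAL data.
# which.max() on a logical vector gives the position of the first TRUE.
first_idx <- unname(apply(incid, 1, which.max))

# The 'unique' table is made of the rows that are their own first occurrence
covar <- dat[unique(first_idx), -1]
rownames(covar) <- NULL

# Renumber: position of each first-occurrence index within the unique ones
keys <- match(first_idx, unique(first_idx))

first_idx
keys
covar
```

Here 'keys' indexes the rows of 'covar' rather than the rows of 'dat', which is what the original question asked for.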
--- Göran Broström <gb at stat.umu.se> wrote:
I have a dataframe 'dat' with one response and some covariates. Many observations (rows), but only a few unique combinations of the covariates. Let's say that the response is in column 1, and the covariates in columns 2:k. I want to do
covar <- unique.data.frame(dat[, 2:k])
y <- dat[, 1]
keys <- ??????
where 'keys' should be a vector of length length(y) and contain the row numbers in 'covar', where the response will find its covariates. Example:
dat
  y x1 x2
1 1  1  0
2 2  0  1
3 3  1  0
unique.data.frame(dat[, 2:3])
  x1 x2
1  1  0
2  0  1
keys
1 1
2 2
3 1

But how do I get 'keys'?

--
 Göran Broström                      tel: +46 90 786 5223
 professor                           fax: +46 90 786 6614
 Department of Statistics            http://www.stat.umu.se/egna/gb/
 Umeå University
 SE-90187 Umeå, Sweden               e-mail: gb at stat.umu.se

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._