Message-ID: <Pine.LNX.4.44.0202210930490.1759-100000@tal.stat.umu.se>
Date: 2002-02-21T08:37:09Z
From: Göran Broström
Subject: Pointer to covariates?
In-Reply-To: <20020220210637.312D73ED3@sitemail.everyone.net>

On Wed, 20 Feb 2002, Gabor Grothendieck wrote:

> In the first line, use the dist function, found in library mva,
> to get the distance between each pair of rows.   From this
> calculate an incidence matrix for which element i,j is true if 
> row i in dat equals row j in dat (and false elsewhere).
> 
> In the second line, for each row calculate the indices of 
> the matching rows and take the minimum of those as the key.
> 
> incid <- as.matrix(dist(dat[,-1],method="max"))==0
> keys <- unlist(lapply(apply(incid,1,which),min))

Thank you very much! This is very fast, much faster than my attempts
so far, but it has two drawbacks:

1. It gives pointers to the first occurrences in the _original_ data frame,
not in the 'unique' version.

2. The first step produces a _huge_ matrix 'incid', too large for my
applications.
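
A minimal sketch of one way around both drawbacks, not taken from the thread: collapse each covariate row into a single string and use match(), so no n-by-n matrix is built. The toy data and the helper name row_id are hypothetical, and the sketch assumes the covariates survive conversion to character without distinct values colliding (pasting formats numbers to about 15 significant digits).

```r
# Toy data (hypothetical): rows 1 and 2 share covariates, row 3 differs.
dat <- data.frame(y = 1:3, x1 = c(1, 1, 0), x2 = c(0, 0, 1))

# Collapse each row of a data frame into one string.
# "\r" is used as a separator that is unlikely to occur in the data.
row_id <- function(df) do.call(paste, c(df, sep = "\r"))

covar <- unique(dat[, -1])   # unique covariate combinations
id <- row_id(dat[, -1])      # one string per observation

# Pointers to first occurrences in the original data frame (drawback 1).
keys_orig <- match(id, id)

# Pointers into 'covar': match each observation against the unique rows.
keys <- match(id, row_id(covar))

# Equivalent: renumber the original-row pointers in order of appearance.
keys_remap <- match(keys_orig, unique(keys_orig))
```

Here keys_orig is 1 1 3, while keys and keys_remap are both 1 1 2, so covar[keys, ] reproduces the covariate columns of dat. Memory use grows with the number of rows rather than with its square.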

However, this is a promising first attempt, and I will try to refine
the idea. Again, thanks!

Göran

> 
> --- Göran Broström <gb at stat.umu.se> wrote:
> >I have a dataframe 'dat' with one response and some covariates. Many 
> >observations  (rows), but only a few unique combinations of 
> >the covariates. Let's say that the response is in column 1, and 
> >the covariates in columns 2:k.
> >
> >I want to do 
> >
> >> covar <- unique.data.frame(dat[, 2:k])
> >> y <- dat[, 1]
> >> keys <- ??????
> >
> >where 'keys' should be a vector of length length(y) and contain the
> >row numbers in 'covar', where the response will find its covariates.
> >
> >Example:
> >
> >> dat
> >  y x1 x2
> >1 1  1  0
> >2 2  0  1
> >3 3  1  0
> >
> >> unique.data.frame(dat[, 2:3])
> >  x1 x2
> >1  1  0
> >2  0  1
> >
> >> keys
> >1  1
> >2  2
> >3  1
> >
> >But how do I get 'keys'?
> >-- 
> > Göran Broström                      tel: +46 90 786 5223
> > professor                           fax: +46 90 786 6614
> > Department of Statistics            http://www.stat.umu.se/egna/gb/
> > Umeå University
> > SE-90187 Umeå, Sweden             e-mail: gb at stat.umu.se
> >
> >-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> >r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> >Send "info", "help", or "[un]subscribe"
> >(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
> >_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
> 

-- 
 Göran Broström                      tel: +46 90 786 5223
 professor                           fax: +46 90 786 6614
 Department of Statistics            http://www.stat.umu.se/egna/gb/
 Umeå University
 SE-90187 Umeå, Sweden             e-mail: gb at stat.umu.se
