Create subsets of data - R-help

Pavan G

Mon, May 9, 2011 6:40 AM

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20110509/8d20bfce/attachment.pl>

Steve Lianoglou

Mon, May 9, 2011 12:14 PM

Hi,

On Mon, May 9, 2011 at 9:40 AM, Pavan G <pavan.namd at gmail.com> wrote:

The solution to this question will also be similar, I guess ... you
should take some time to figure out how indexing with (different types
of) vectors works:

http://cran.r-project.org/doc/manuals/R-intro.html#Index-vectors

If your question doesn't have to do with selecting the appropriate
elements of your data, are you asking how to take a "mean" or std.dev
of the subset of points you know how to select? It's not clear what
you mean by "mean" ... do you want the average value of x vs. avg.
value of y ... or do you want the "centroid" of a set of points in
each quadrant, or ... what?
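
As a minimal sketch of indexing with a logical vector (not taken from the
original post): the data, the seed, and the choice of the lower-left cell
below are all assumptions made up for illustration.

```r
# Hypothetical data: 2000 points spread uniformly over the square (0,2) x (0,2).
set.seed(1)  # arbitrary seed so the sketch is repeatable
x <- runif(2000, 0, 2)
y <- runif(2000, 0, 2)

# A logical index vector: TRUE where the point falls in the lower-left cell.
in_cell <- x <= 1 & y <= 1

# Indexing with the logical vector keeps only the selected elements.
x_sub <- x[in_cell]
y_sub <- y[in_cell]

# One reading of "mean": the centroid (average x, average y) of the subset.
centroid <- c(x = mean(x_sub), y = mean(y_sub))

# Spread of each coordinate within the subset.
spread <- c(x = sd(x_sub), y = sd(y_sub))

centroid
spread
```

Changing the condition that builds in_cell selects a different cell; the
rest of the code stays the same.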

A curious choice for a valediction ...

Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

David Winsemius

Mon, May 9, 2011 12:46 PM

On May 9, 2011, at 9:40 AM, Pavan G wrote:

I am assuming that you have yet a third vector that has a value at each
of those coordinates, and that it is this set of categorized values for
which you want the means and standard deviations. (And excuse the
earlier feeble attempt at humor.) I will exemplify the use of tapply on
cut()-igorized vectors, with a z-value which is the L1 distance from the
origin:

 > x <- runif(2000, 0, 2)
 > y <- runif(2000, 0, 2)
 > xc <- cut(x, c(0,1,2))
 > yc <- cut(y, c(0,1,2))
 > z <- x+y
 > tapply(z, list(xc,yc), mean)
          (0,1]    (1,2]
(0,1] 1.013192 2.016095
(1,2] 1.979930 2.996229  # seems to make sense
 > tapply(z, list(xc,yc), sd)
           (0,1]     (1,2]
(0,1] 0.4028310 0.4133113
(1,2] 0.4239014 0.3984559 # also seems sensible
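
If what is wanted is the centroid of the points in each cell rather than a
summary of a third variable, the same cut() factors can be reused. This is
only a sketch: the seed and the data frame name d are assumptions, and the
numbers will not match the run above.

```r
set.seed(42)  # arbitrary seed, not the one behind the output above
x <- runif(2000, 0, 2)
y <- runif(2000, 0, 2)
xc <- cut(x, c(0, 1, 2))
yc <- cut(y, c(0, 1, 2))

# How many points landed in each of the four cells.
counts <- table(xc, yc)

# Per-cell mean of each coordinate, as two 2 x 2 matrices.
mean_x <- tapply(x, list(xc, yc), mean)
mean_y <- tapply(y, list(xc, yc), mean)

# The same centroids in long form: one row per cell.
d <- data.frame(x = x, y = y, xc = xc, yc = yc)
cent <- aggregate(cbind(x, y) ~ xc + yc, data = d, FUN = mean)

counts
cent
```

Swapping FUN = mean for FUN = sd gives the per-cell standard deviation of
each coordinate in the same layout.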

David Winsemius, MD
West Hartford, CT

David Winsemius

Mon, May 9, 2011 12:57 PM

On May 9, 2011, at 3:46 PM, David Winsemius wrote:

(Actually, after checking help(dist)  I think the sum(x_i) is called  
the Canberra distance. )
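
For anyone who wants to check a distance name against help(dist), here is a
small sketch. The point coordinates are made up for illustration. With these
values the "manhattan" method of dist() reproduces x + y (the L1 norm),
while "canberra" divides each coordinate difference by the coordinate sum
and so gives a different number.

```r
# Hypothetical point in the positive quadrant, plus the origin.
p <- rbind(origin = c(0, 0), point = c(0.25, 1.5))

# Distance from the origin under two of the methods listed in help(dist).
d_manhattan <- as.numeric(dist(p, method = "manhattan"))
d_canberra  <- as.numeric(dist(p, method = "canberra"))

# Plain sum of the coordinates, i.e. the z-value used in the example.
z <- sum(p["point", ])

d_manhattan
d_canberra
z
```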

David Winsemius, MD
West Hartford, CT

Pavan G

Tue, May 10, 2011 8:32 AM

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20110510/5296bb41/attachment.pl>