Create subsets of data

5 messages · Pavan G, Steve Lianoglou, David Winsemius

#
Hi,
On Mon, May 9, 2011 at 9:40 AM, Pavan G <pavan.namd at gmail.com> wrote:
The solution to this question will also be similar, I guess ... you
should take some time to figure out how indexing with (different types
of) vectors works:

http://cran.r-project.org/doc/manuals/R-intro.html#Index-vectors
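
As a rough illustration of the index-vector idea, here is a minimal sketch of selecting the points that fall in one quadrant. The vectors x and y, the cutoff of 1, and the name upper_right are made up for the example and are not taken from the original question:

```r
# Hypothetical data: six points with coordinates in [0, 2]
x <- c(0.2, 1.5, 0.7, 1.9, 1.1, 0.4)
y <- c(1.8, 0.3, 0.6, 1.2, 1.7, 1.3)

# Logical index vector: TRUE where the point lies in the upper-right quadrant
upper_right <- x > 1 & y > 1

# Logical indexing keeps only the elements where the index is TRUE
x_ur <- x[upper_right]
y_ur <- y[upper_right]

# Integer (positional) indexing selects the same elements via which()
idx <- which(upper_right)
stopifnot(identical(x[idx], x_ur))

print(data.frame(x = x_ur, y = y_ur))
```

Once a subset can be selected this way, mean() and sd() can be applied to it directly, e.g. mean(x[upper_right]).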

If your question doesn't have to do with selecting the appropriate
elements of your data, are you asking how to take a "mean" or std.dev
of the subset of points you know how to select? It's not clear what
you mean by "mean" ... do you want the average value of x vs. avg.
value of y ... or do you want the "centroid" of a set of points in
each quadrant, or ... what?
A curious choice for a valediction ...
#
On May 9, 2011, at 9:40 AM, Pavan G wrote:

I am assuming that you have yet a third vector that has values at each
of those coordinates, and that it is this set of categorized values for
which you want the means and std deviations. I will exemplify the use
of tapply() on cut()-igorized vectors and a z-value which is the L1
distance from the origin (and excuse the earlier feeble attempt at
humor):

 > x <- runif(2000, 0,2)
 > y <- runif(2000, 0, 2)
 > xc <- cut(x, c(0,1,2))
 > yc <- cut(y, c(0,1,2))
 > z <- x+y
 > tapply(z, list(xc,yc), mean)
          (0,1]    (1,2]
(0,1] 1.013192 2.016095
(1,2] 1.979930 2.996229  # seems to make sense
 > tapply(z, list(xc,yc), sd)
           (0,1]     (1,2]
(0,1] 0.4028310 0.4133113
(1,2] 0.4239014 0.3984559 # also seems sensible
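
One way to sanity-check a cell of the tapply() result is to recompute it by hand with a logical index vector. The sketch below assumes the same setup as above, with a smaller sample and an arbitrary seed chosen only to make it reproducible; the names cell_means and manual are illustrative:

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible
x <- runif(200, 0, 2)
y <- runif(200, 0, 2)
z <- x + y   # same definition of z as in the example above

# cut() turns each coordinate into a factor with levels (0,1] and (1,2]
xc <- cut(x, c(0, 1, 2))
yc <- cut(y, c(0, 1, 2))

# tapply() applies the function within each combination of factor levels
cell_means <- tapply(z, list(xc, yc), mean)

# The same cell computed by hand with a logical index vector
manual <- mean(z[x > 1 & y > 1])

stopifnot(isTRUE(all.equal(cell_means["(1,2]", "(1,2]"], manual)))
```

The tapply() form scales more easily to finer grids, since only the break points passed to cut() need to change.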

#
On May 9, 2011, at 3:46 PM, David Winsemius wrote:

(Actually, after checking help(dist), the sum of the |x_i| is the
distance that dist() calls "manhattan", i.e. the L1 distance as stated;
"canberra" is a different, term-wise scaled measure.)
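
A quick check against dist() may help here. As far as help(dist) describes it, the plain sum of absolute coordinate differences corresponds to method = "manhattan", while "canberra" divides each term by |x_i| + |y_i|. The point p below is made up for illustration:

```r
# Hypothetical point with non-negative coordinates, plus the origin
p <- c(0.5, 1.0)
m <- rbind(origin = c(0, 0), p = p)

# Manhattan (L1) distance from the origin: sum of absolute coordinates
d_manhattan <- as.numeric(dist(m, method = "manhattan"))

# Canberra distance: each term is scaled by |x_i| + |y_i|
d_canberra <- as.numeric(dist(m, method = "canberra"))

stopifnot(isTRUE(all.equal(d_manhattan, sum(abs(p)))))
print(c(manhattan = d_manhattan, canberra = d_canberra))
```

Measured from the origin, each Canberra term reduces to |x_i| / |x_i|, so it does not reproduce x + y.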
David Winsemius, MD
West Hartford, CT