Create subsets of data
5 messages · Pavan G, Steve Lianoglou, David Winsemius
Hi,
On Mon, May 9, 2011 at 9:40 AM, Pavan G <pavan.namd at gmail.com> wrote:
Hello All, Let's say I have points on an x-y plane. x ranges from 0-2 and y from 0-2. There are points in quadrants x[0:1]---y[0:1] and in x[1:2]----y[1:2]. I would like to get the mean and std of the points in the x[0:1]----y[0:1] quadrant alone. Is there a straightforward way to do it? I asked a similar question a few days ago regarding plotting a subset of data using conditions. The solution was: http://r.789695.n4.nabble.com/Conditional-plot-length-in-R-td3503855.html
The solution to this question will also be similar, I guess ... you should take some time to figure out how indexing with (different types of) vectors works: http://cran.r-project.org/doc/manuals/R-intro.html#Index-vectors

If your question doesn't have to do with selecting the appropriate elements of your data, are you asking how to take a "mean" or std.dev of the subset of points you know how to select? It's not clear what you mean by "mean" ... do you want the average value of x vs. avg. value of y ... or do you want the "centroid" of a set of points in each quadrant, or ... what?
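As a minimal sketch of the index-vector approach, assuming the points live in two numeric vectors `x` and `y` (the names and the simulated data are illustrative, not taken from the original post), a logical vector can pick out the lower-left quadrant, after which mean() and sd() apply to the selected coordinates:

```r
# Illustrative data only: the original post did not include any.
set.seed(1)
x <- runif(200, 0, 2)
y <- runif(200, 0, 2)

# Logical index vector: TRUE for points with 0 <= x <= 1 and 0 <= y <= 1.
in_q1 <- x >= 0 & x <= 1 & y >= 0 & y <= 1

# Per-coordinate mean (the centroid of the selected points) and sd.
centroid <- c(x = mean(x[in_q1]), y = mean(y[in_q1]))
spread   <- c(x = sd(x[in_q1]),   y = sd(y[in_q1]))

centroid
spread
```

Whether the per-coordinate mean is the quantity wanted depends on the answer to the question above; the same `in_q1` index would select any third vector of values recorded at those points.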
Thank you, Why-so-serious.
A curious choice for a valediction ...
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
On May 9, 2011, at 9:40 AM, Pavan G wrote:
Hello All, Let's say I have points on an x-y plane. x ranges from 0-2 and y from 0-2. There are points in quadrants x[0:1]---y[0:1] and in x[1:2]----y[1:2]. I would like to get the mean and std of the points in the x[0:1]----y[0:1] quadrant alone. Is there a straightforward way to do it?
I am assuming that you have a third vector holding a value at each
of those coordinates, and that it is this set of categorized values
for which you want the means and standard deviations. I will
illustrate the use of tapply() on cut()-igorized vectors, with a
z-value that is the L1 distance from the origin. (And excuse the
earlier feeble attempt at humor.)
> x <- runif(2000, 0,2)
> y <- runif(2000, 0, 2)
> xc <- cut(x, c(0,1,2))
> yc <- cut(y, c(0,1,2))
> z <- x+y
> tapply(z, list(xc,yc), mean)
         (0,1]    (1,2]
(0,1] 1.013192 2.016095
(1,2] 1.979930 2.996229   # seems to make sense
> tapply(z, list(xc,yc), sd)
          (0,1]     (1,2]
(0,1] 0.4028310 0.4133113
(1,2] 0.4239014 0.3984559   # also seems sensible
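Two details of this approach may be worth spelling out. cut() builds right-closed intervals by default, so a coordinate of exactly 0 falls outside (0,1] and becomes NA unless include.lowest = TRUE is given; and the tapply() result is an ordinary matrix, so the single quadrant asked about can be pulled out by its dimnames. A small sketch with made-up points (not data from the thread):

```r
# Made-up points, including one sitting exactly on the x = 0 boundary.
x <- c(0, 0.5, 0.8, 1.5, 1.2)
y <- c(0.2, 0.4, 0.9, 1.1, 1.8)
z <- x + y

# Default breaks are right-closed: 0 is not in (0,1], so it becomes NA.
cut(x, c(0, 1, 2))

# include.lowest = TRUE closes the first interval on the left: [0,1].
xc <- cut(x, c(0, 1, 2), include.lowest = TRUE)
yc <- cut(y, c(0, 1, 2), include.lowest = TRUE)

# tapply() returns a matrix indexed by the factor levels;
# select the lower-left cell by name.
m <- tapply(z, list(xc, yc), mean)
m["[0,1]", "[0,1]"]
```

Cells that contain no points come back as NA in the matrix. With runif() draws an exact 0 is very unlikely, but for real data that can sit on a boundary the include.lowest setting matters.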
David Winsemius, MD West Hartford, CT
On May 9, 2011, at 3:46 PM, David Winsemius wrote:
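(Actually, after checking help(dist): the sum of the |x_i| is what dist() calls the "manhattan" distance, i.e. the L1 distance as stated; the Canberra distance is a different, normalized measure.)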
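For what it's worth, the two metrics can be compared directly with dist(). A small sketch with one made-up point suggests that x + y for non-negative coordinates matches method = "manhattan" (the L1 distance), whereas method = "canberra" is a normalized sum and gives a different number:

```r
# One illustrative point and the origin, as rows of a matrix.
pts <- rbind(origin = c(0, 0), p = c(0.5, 0.3))

# "manhattan" is the sum of absolute coordinate differences,
# which for non-negative coordinates is x + y measured from the origin.
d_manhattan <- as.numeric(dist(pts, method = "manhattan"))

# "canberra" divides each absolute difference by the sum of the
# absolute values, so it is a different quantity from x + y.
d_canberra <- as.numeric(dist(pts, method = "canberra"))

c(manhattan = d_manhattan, canberra = d_canberra, x_plus_y = 0.5 + 0.3)
```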
David Winsemius, MD West Hartford, CT