quantile / centile
4 messages · Henrique Dallazuanna, Donald Braman, Peter Dalgaard
Try this:
my.df$my.newvar <- quantile(my.df$my.var, probs = seq(0.01, 1, 0.01))
On Sat, Sep 27, 2008 at 3:50 AM, Donald Braman <dbraman at law.gwu.edu> wrote:
I'm wondering if there is a simple way to assign a quantile to a vector in a
data frame, much like one could in Stata using centile. Let's say I want 100
slices in my assignment. I can easily see what the limits of each slice are
by using quantile:
quantile(my.df$my.var, probs=seq(0, 1, 0.01))
But how do I assign the appropriate value to each row/record in my data
frame? Clearly the following won't work, but what will?
my.df$my.new.var <- quantile(my.df$my.var, probs=seq(0, 1, 0.01))
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40" S 49° 16' 22" O
Donald Braman wrote:
Thanks for the response! Unfortunately, I was unclear: my problem is not that I need to know what the percentile ranges are, but that I need to assign the appropriate percentile range to each of the records in my data frame. My data frame contains somewhere between 1000 and 9000 rows/records (depending on context), not a hundred. That is, I'd like to assign to each row the quantile bin, as defined by the quantile() result, into which that record falls. Thanks again for any help!
You can use

  cnt <- cut(x, quantile(x, seq(0,1,0.01)), include.lowest=TRUE)
  levels(cnt) <- 1:100 # if you want to get rid of ugly interval labels

With Harrell's Hmisc package, there's also

  cnt <- cut2(x, g=100)

Or you can take a more basic approach and do

  N <- sum(!is.na(x))
  cnt <- ceiling(rank(x)/N*100)
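For anyone who wants to try the suggestions above on a data frame, here is a minimal sketch. The simulated data and the column name my.centile are assumptions made up for illustration; my.df and my.var are the names from the original question. It uses labels=FALSE to get bin numbers directly instead of relabelling the factor, and it multiplies the rank by 100 before dividing by N, which should keep exact bin boundaries from being pushed into the next bin by floating-point rounding. The Hmisc variant is left out so the example runs in base R.

```r
set.seed(1)                                  # hypothetical data, illustration only
my.df <- data.frame(my.var = rnorm(1000))    # continuous, so the break points are distinct
x <- my.df$my.var

# cut() on the quantile break points; labels = FALSE returns the bin number (1..100)
brk <- quantile(x, probs = seq(0, 1, 0.01), na.rm = TRUE)
my.df$my.centile <- cut(x, breaks = brk, include.lowest = TRUE, labels = FALSE)

# rank-based alternative, which needs no break points
N <- sum(!is.na(x))
cnt_rank <- ceiling(rank(x, na.last = "keep") * 100 / N)

head(my.df)                                  # one centile number per row
```

Note that cut() will stop with an error if the data have so many ties that two quantile break points coincide; the rank-based version does not have that problem, although tied values then share an averaged rank.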
On Sat, Sep 27, 2008 at 8:54 AM, Henrique Dallazuanna <wwwhsd at gmail.com> wrote:
Try this:
my.df$my.newvar <- quantile(my.df$my.var, probs = seq(0.01,1, 0.01))
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907