Hello all,
Thanks in advance for your attention.
I would like to generate a third variable that holds the quantile
(percentile) position of each value of a variable in a data frame.
# generating data
dat <- data.frame(id = 1:30, score = rnorm(30, 20, 7))
dat
# finding percentiles of "score"
qs <- as.matrix(quantile(dat$score, type=3, probs = seq(0,1,.1)))
colnames(qs) <- c( "score")
qs
# is there a way to put the quantile value for a value of 'score' into a new variable,
# such that the new data frame would have three variables: id, score and q.score?
## running R version 2.8.1 (2008-12-22) on Vista
Thanks so much!
-Jon
create new variable: percentile value of variable in data frame
4 messages · Jonathan Beard, Stephan Kolassa, David Winsemius
Hi Jon,

does the empirical cumulative distribution function do what you want?

dat$q.score <- ecdf(dat$score)(dat$score)
?ecdf

HTH
Stephan
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
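As a small illustration of the ecdf() suggestion above: the sketch below uses five made-up scores instead of the thread's random data, so the result can be checked by hand. For distinct values, ecdf(x)(x) is the proportion of observations less than or equal to each value.

```r
# Made-up scores (not from the thread), chosen so the result is easy to verify.
dat <- data.frame(id = 1:5, score = c(14, 22, 9, 30, 18))

# ecdf() returns a function; applying it to the scores themselves gives,
# for each score, the fraction of scores that are <= that score.
dat$q.score <- ecdf(dat$score)(dat$score)

dat$q.score
# 0.4 0.8 0.2 1.0 0.6
```

The resulting data frame has the three requested variables: id, score and q.score.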
1 day later
Hi Stephan,

thanks for your response. It looks like the ecdf() works like it should.

I have a quick follow-up: I didn't notice any discussion in the help documents of the methods behind ecdf() and quantile(type=3) being equivalent. It looks like the results produced by each method are consistent. Any thoughts?

Again, thanks so much,
-Jon
On Fri, May 28, 2010 at 4:06 PM, Stephan Kolassa <Stephan.Kolassa at gmx.de> wrote:
> Hi Jon,
> does the empirical cumulative distribution function do what you want?
> dat$q.score <- ecdf(dat$score)(dat$score)
> ?ecdf
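On the follow-up question above: the ?quantile help page describes type 1 as the inverse of the empirical distribution function and type 3 as the SAS definition (nearest even order statistic), so type 3 is related to ecdf() but is not its exact inverse. The sketch below, on eight made-up values, suggests one reason the results can look consistent: at the probabilities that ecdf() itself returns, types 1 and 3 both give back the original values, while at probabilities in between they can differ.

```r
# Eight made-up, already sorted values (not from the thread).
x <- c(10, 20, 30, 40, 50, 60, 70, 80)

# Probabilities produced by the ECDF at the data points: 1/8, 2/8, ..., 8/8.
p <- ecdf(x)(x)

# At these probabilities both definitions return the original values.
t1 <- unname(quantile(x, probs = p, type = 1))
t3 <- unname(quantile(x, probs = p, type = 3))

# At a probability between the steps the two types can disagree:
# type 1 takes the next order statistic, type 3 the nearest one.
q1 <- unname(quantile(x, probs = 0.3, type = 1))  # 30
q3 <- unname(quantile(x, probs = 0.3, type = 3))  # 20
```

If an exact inverse of ecdf() is wanted, type = 1 is the documented choice; with the deciles used in the original post the difference may or may not show up, depending on the data.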
On May 30, 2010, at 9:03 AM, Jonathan Beard wrote:
> Hi Stephan, thanks for your response. It looks like the ecdf() works like it should. I have a quick follow-up: I didn't notice any discussion in the help documents of the methods behind ecdf() and quantile(type=3) being equivalent. It looks like the results produced by each method are consistent.

If you want a method based on what you know to be type 3 quantiles, consider:

dat$q.score <- findInterval(dat$score, qs)
dat

You can adjust the arguments of findInterval to settle, to your specifications, which end of each interval is open.
David.
David Winsemius, MD
West Hartford, CT
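A minimal sketch of the findInterval() approach above, again on made-up scores rather than the thread's random data. With the default arguments the intervals are closed on the left and open on the right, so the maximum score lands in an extra bin beyond the last interval; rightmost.closed = TRUE is one documented argument that folds it into the last interval. Quartile breaks are used here only to keep the output short.

```r
# Made-up scores (not from the thread): 12, 15, ..., 39.
dat <- data.frame(id = 1:10, score = seq(12, 39, by = 3))

# Type 3 quantile breaks, as in the original post but with quartiles.
qs <- quantile(dat$score, probs = seq(0, 1, 0.25), type = 3)

# Default: intervals are [break_i, break_(i+1)), so the maximum score
# is assigned bin number length(qs), one more than the number of intervals.
dat$bin.default <- findInterval(dat$score, qs)

# rightmost.closed = TRUE treats the last interval as closed on both ends,
# so the maximum score falls into the last interval instead.
dat$q.score <- findInterval(dat$score, qs, rightmost.closed = TRUE)

dat
```

Unlike ecdf(), which returns a proportion for each score, this returns the index of the quantile interval each score falls into, so the choice between the two depends on which of those q.score is meant to hold.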