Dear all I have a question about quantiles standard error, partly practical partly theoretical. I know that x<-rlnorm(100000, log(200), log(2)) quantile(x, c(.10,.5,.99)) computes quantiles but I would like to know if there is any function to find standard error (or any dispersion measure) of these estimated values. And here is a theoretical one. I feel that when I compute median from given set of values it will have lower standard error then 0.1 quantile computed from the same set of values. Is it true? If yes can you point me to some reasoning? Thanks for all answers. Regards Petr PS. I found mcmcse package which shall compute the standard error but which I could not make to work probably because I do not have recent R-devel version installed Error in eval(expr, envir, enclos) : could not find function ".getNamespace" Error : unable to load R code in package 'mcmcse' Error: package/namespace load failed for 'mcmcse' Maybe I will also something find in quantreg package, but I did not went through it yet.
standard error for quantile
8 messages · Bert Gunter, (Ted Harding), Jim Lemon +1 more
Petr: 1. Not an R question. 2. You want the distribution of order statistics. Search on that. It's basically binomial/beta. -- Bert
On Tue, Oct 30, 2012 at 6:46 AM, PIKAL Petr <petr.pikal at precheza.cz> wrote:
Dear all I have a question about quantiles standard error, partly practical partly theoretical. I know that x<-rlnorm(100000, log(200), log(2)) quantile(x, c(.10,.5,.99)) computes quantiles but I would like to know if there is any function to find standard error (or any dispersion measure) of these estimated values. And here is a theoretical one. I feel that when I compute median from given set of values it will have lower standard error then 0.1 quantile computed from the same set of values. Is it true? If yes can you point me to some reasoning? Thanks for all answers. Regards Petr PS. I found mcmcse package which shall compute the standard error but which I could not make to work probably because I do not have recent R-devel version installed Error in eval(expr, envir, enclos) : could not find function ".getNamespace" Error : unable to load R code in package 'mcmcse' Error: package/namespace load failed for 'mcmcse' Maybe I will also something find in quantreg package, but I did not went through it yet.
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
On 30-Oct-2012 13:46:17 PIKAL Petr wrote:
Dear all I have a question about quantiles standard error, partly practical partly theoretical. I know that x<-rlnorm(100000, log(200), log(2)) quantile(x, c(.10,.5,.99)) computes quantiles but I would like to know if there is any function to find standard error (or any dispersion measure) of these estimated values. And here is a theoretical one. I feel that when I compute median from given set of values it will have lower standard error then 0.1 quantile computed from the same set of values. Is it true? If yes can you point me to some reasoning? Thanks for all answers. Regards Petr ["PS" deleted]
The general asymptotic result for the pth quantile (0<p<1) X.p of a sample of size n is that it is asymptotically Normally distributed with mean the pth quantile Q.p of the parent distribution and var(X.p) = p*(1-p)/(n*f(Q.p)^2) where f(x) is the probability density function of the parent distribution. This is not necessarily very helpful for small sample sizes (depending on the parent distribution). However, it is possible to obtain a general result giving an exact confidence interval for Q.p given the entire ordered sample, though there is only a restricted set of confidence levels to which it applies. If you'd like more detail about the above, I could write up derivations and make the write-up available. Hoping this helps, Ted. ------------------------------------------------- E-Mail: (Ted Harding) <Ted.Harding at wlandres.net> Date: 30-Oct-2012 Time: 17:40:55 This message was sent by XFMail
On 10/31/2012 12:46 AM, PIKAL Petr wrote:
Dear all I have a question about quantiles standard error, partly practical partly theoretical. I know that x<-rlnorm(100000, log(200), log(2)) quantile(x, c(.10,.5,.99)) computes quantiles but I would like to know if there is any function to find standard error (or any dispersion measure) of these estimated values. And here is a theoretical one. I feel that when I compute median from given set of values it will have lower standard error then 0.1 quantile computed from the same set of values. Is it true? If yes can you point me to some reasoning?
Hi Petr, Using a resampling method, it depends upon the distribution of the values. If you have a "love-hate" distribution (bimodal and heavily weighted toward extreme values), the median standard error can be larger. Try this: x<-sample(-5:5,1000,TRUE, prob=c(0.2,0.1,0.05,0.04,0.03,0.02,0.03,0.04,0.05,0.1,0.2)) x<-ifelse(x<0,x+runif(1000),x-runif(1000)) hist(x) mcse.q(x, 0.1) $est [1] -3.481419 $se [1] 0.06887319 mcse.q(x, 0.5) $est [1] 1.088475 $se [1] 0.3440115 > mcse.q(x, 0.1) $est [1] -3.481419 $se [1] 0.06887319 Jim
Thanks Jim. After reinstall of new R version all mentioned packages work. I tested various functions which revealed that on my lognorm data there is no big difference in error of median or 10% quantile. I also found some function for quantile se computing in Hmisc package. Petr
-----Original Message----- From: Jim Lemon [mailto:jim at bitwrit.com.au] Sent: Wednesday, October 31, 2012 9:56 AM To: PIKAL Petr Cc: r-help at r-project.org Subject: Re: [R] standard error for quantile On 10/31/2012 12:46 AM, PIKAL Petr wrote:
Dear all I have a question about quantiles standard error, partly practical partly theoretical. I know that x<-rlnorm(100000, log(200), log(2)) quantile(x, c(.10,.5,.99)) computes quantiles but I would like to know if there is any function to find standard error (or any dispersion measure) of these estimated values. And here is a theoretical one. I feel that when I compute median from given set of values it will have lower standard error then 0.1 quantile computed from the same set of values. Is it true? If yes can you point me to some reasoning?
Hi Petr, Using a resampling method, it depends upon the distribution of the values. If you have a "love-hate" distribution (bimodal and heavily weighted toward extreme values), the median standard error can be larger. Try this: x<-sample(-5:5,1000,TRUE, prob=c(0.2,0.1,0.05,0.04,0.03,0.02,0.03,0.04,0.05,0.1,0.2)) x<-ifelse(x<0,x+runif(1000),x-runif(1000)) hist(x) mcse.q(x, 0.1) $est [1] -3.481419 $se [1] 0.06887319 mcse.q(x, 0.5) $est [1] 1.088475 $se [1] 0.3440115
> mcse.q(x, 0.1)
$est [1] -3.481419 $se [1] 0.06887319 Jim
Hi Bert
-----Original Message----- From: Bert Gunter [mailto:gunter.berton at gene.com] Sent: Tuesday, October 30, 2012 3:37 PM To: PIKAL Petr Cc: r-help at r-project.org Subject: Re: [R] standard error for quantile Petr: 1. Not an R question.
Partly. I asked also for pointing me to some R functions which can compute such standard error, which is by my humble opinion valid R question.
2. You want the distribution of order statistics. Search on that. It's basically binomial/beta.
Hm. I looked at some web info, which is quite good for trained statistician but at the edge of my understanding as chemist (and sometimes beyound:-). Thanks anyway. Regards Petr
-- Bert On Tue, Oct 30, 2012 at 6:46 AM, PIKAL Petr <petr.pikal at precheza.cz> wrote:
Dear all I have a question about quantiles standard error, partly practical partly theoretical. I know that x<-rlnorm(100000, log(200), log(2)) quantile(x, c(.10,.5,.99)) computes quantiles but I would like to know if there is any function to find standard error (or any dispersion measure) of these estimated values. And here is a theoretical one. I feel that when I compute median from given set of values it will have lower standard error then 0.1 quantile computed from the same set of values. Is it true? If yes can you point me to some reasoning? Thanks for all answers. Regards Petr PS. I found mcmcse package which shall compute the standard error but which I could not make to work probably because I do not have recent R-devel version installed Error in eval(expr, envir, enclos) : could not find function ".getNamespace" Error : unable to load R code in package 'mcmcse' Error: package/namespace load failed for 'mcmcse' Maybe I will also something find in quantreg package, but I did not went through it yet.
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
-- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb- biostatistics/pdb-ncb-home.htm
Hi Ted
-----Original Message----- From: ted at deb [mailto:ted at deb] On Behalf Of Ted Harding Sent: Tuesday, October 30, 2012 6:41 PM To: r-help at r-project.org
<snip>
The general asymptotic result for the pth quantile (0<p<1) X.p of a sample of size n is that it is asymptotically Normally distributed with mean the pth quantile Q.p of the parent distribution and var(X.p) = p*(1-p)/(n*f(Q.p)^2) where f(x) is the probability density function of the parent distribution.
So if I understand correctly p*(1-p) is biggest when p=0.5 and decreases with smaller or bigger p. The var(X.p) then depends on ratio to parent distribution at this p probability. For lognorm distribution and 200 values the resulting var is
(0.5*(1-.5))/(200*qlnorm(.5, log(200), log(2))^2)
[1] 3.125e-08
(0.1*(1-.1))/(200*qlnorm(.1, log(200), log(2))^2)
[1] 6.648497e-08 so 0.1 var is slightly bigger than 0.5 var. For different distributions this can be reversed as Jim pointed out. Did I manage to understand? Thank you very much. Regards Petr
This is not necessarily very helpful for small sample sizes (depending on the parent distribution). However, it is possible to obtain a general result giving an exact confidence interval for Q.p given the entire ordered sample, though there is only a restricted set of confidence levels to which it applies. If you'd like more detail about the above, I could write up derivations and make the write-up available. Hoping this helps, Ted. ------------------------------------------------- E-Mail: (Ted Harding) <Ted.Harding at wlandres.net> Date: 30-Oct-2012 Time: 17:40:55 This message was sent by XFMail -------------------------------------------------
[see in-line below]
On 31-Oct-2012 10:26:14 PIKAL Petr wrote:
Hi Ted
-----Original Message----- From: ted at deb [mailto:ted at deb] On Behalf Of Ted Harding Sent: Tuesday, October 30, 2012 6:41 PM To: r-help at r-project.org
<snip>
The general asymptotic result for the pth quantile (0<p<1) X.p of a sample of size n is that it is asymptotically Normally distributed with mean the pth quantile Q.p of the parent distribution and var(X.p) = p*(1-p)/(n*f(Q.p)^2) where f(x) is the probability density function of the parent distribution.
So if I understand correctly p*(1-p) is biggest when p=0.5 and decreases with smaller or bigger p. The var(X.p) then depends on ratio to parent distribution at this p probability. For lognorm distribution and 200 values the resulting var is
(0.5*(1-.5))/(200*qlnorm(.5, log(200), log(2))^2)
[1] 3.125e-08
(0.1*(1-.1))/(200*qlnorm(.1, log(200), log(2))^2)
[1] 6.648497e-08 so 0.1 var is slightly bigger than 0.5 var. For different distributions this can be reversed as Jim pointed out. Did I manage to understand? Thank you very much. Regards Petr
Yes, it looks as though you understand! As a further illustration,
here is some R code applied to examples where the parent distrbution
is uniform or Normal. For each case, the reraults are stated as
first: simulated; second: by the formula. It can be seen that for
n=200 the formula and the simulations are close.
Ted.
###################################################################
## Test of formula for var(quantile)
varQ <- function(p,n,f.p) {
p*(1-p)/(n*(f.p^2))
}
## Test 1: Uniform (0,1), n = 200
n <- 200
## Pick one of (a), (b), (c):
## (a)# p <- 0.50 ; q <- 100 ; f.p <- 1
## (b)# p <- 0.25 ; q <- 50 ; f.p <- 1
## (c)# p <- 0.10 ; q <- 25 ; f.p <- 1
Nsim <- 1000
Qs <- numeric(Nsim)
for( i in (1:Nsim) ){
Qs[i] <- sort(runif(n))[q]
}
var(Qs)
varQ(p,n,f.p)
## (a) 0.001239982
## 0.00125
## (b) 0.0008877879
## 0.0009375
## (c) 0.0005619348
## 0.00045
## Test 2: N(0,1), n = 200
n <- 200
## Pick one of (a), (b), (c):
## (a)# p <- 0.50 ; q <- 100 ; f.p <- dnorm(qnorm(0.50))
## (b)# p <- 0.25 ; q <- 50 ; f.p <- dnorm(qnorm(0.25))
## (c)# p <- 0.10 ; q <- 20 ; f.p <- dnorm(qnorm(0.10))
Nsim <- 1000
Qs <- numeric(Nsim)
for( i in (1:Nsim) ){
Qs[i] <- sort(rnorm(n))[q]
}
var(Qs)
varQ(p,n,f.p)
## (a) 0.007633568
## 0.007853982
## (b) 0.009370099
## 0.009283837
## (c) 0.01420517
## 0.01461055
###################################################################
This is not necessarily very helpful for small sample sizes (depending on the parent distribution). However, it is possible to obtain a general result giving an exact confidence interval for Q.p given the entire ordered sample, though there is only a restricted set of confidence levels to which it applies. If you'd like more detail about the above, I could write up derivations and make the write-up available. Hoping this helps, Ted. ------------------------------------------------- E-Mail: (Ted Harding) <Ted.Harding at wlandres.net> Date: 30-Oct-2012 Time: 17:40:55 This message was sent by XFMail -------------------------------------------------
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
------------------------------------------------- E-Mail: (Ted Harding) <Ted.Harding at wlandres.net> Date: 31-Oct-2012 Time: 18:10:16 This message was sent by XFMail