how R implement qnorm()

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20121017/ea9c66ed/attachment.pl>
how R implement qnorm()

I wonder anyone knows the mathematical process that R calculated the
quantile?
It's on the help page!

'For qnorm, the code is a C translation of

Wichura, M. J. (1988) Algorithm AS 241: The Percentage Points of the 
Normal Distribution. Applied Statistics, 37, 477?484.'

and the exact code is in the R sources.  E.g. online at 
https://svn.r-project.org/R/trunk/src/nmath/qnorm.c
The reason I asked is soly by curiosity. I know the probability of a normal
distribution is calculated through integrate the Gaussian function, which
can be implemented easily (see code),  while the calculation of quantile
(or Z??) in R is a bit confusing as it requires inverse error function (X =
- sqrt(2)* erf-1 (2*P)), while R doesn't have a build in one. The InvErf
function most people use is through qnorm( InvErf=function(x)
I think you are wrong about 'most people': this is the notation used by 
a small group of non-statisticians (mainly physicists, I think).
qnorm((1+x)/2)/sqrt(2) ). When you type qnorm in the console, it doesn't
show it as it is an internal function, I searched around can't found too
much information, my hunch is R might be using some internal library that's
in the chipset which can calculate erf-1(x), but it is not accessible to
user.
No, FPUs do not commonly have inverse erf functions.
Any information is welcomed. thanks.

Sheng

code for implementation of pnorm()
--------------------------------------------------

p.Gaussian=function (z, mean=0,sd=1) {

Gaussian=function(x) {1/(sqrt(2*pi)*sd)*exp(-(x-mean)^2/(2*sd^2))}
per=integrate(Gaussian,lower=-Inf,upper=z)

return (per$value)
}

code for implementation of qnorm()
--------------------------------------------------
# I've figured out one that uses the uniroot function to get x, it
approximate qnorm() well but not exactly. I would be very happy to see the
implementation through a mathematical formula such as using the X = -
sqrt(2)* erf-1 (2*P), P is the probability).

q.Gaussian=function(p,mean=0,sd=1) {
    variable = function(x) p.Gaussian(x)-p
    z = uniroot(variable, interval=c(-4,4))
    v = z$root*sd+mean
    return(v)

}

	[[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

R is a bit confusing as it requires inverse error function (X =
- sqrt(2)* erf-1 (2*P)), while R doesn't have a build in one. The InvErf
function most people use is through qnorm( InvErf=function(x)
I think you are wrong about 'most people': this is the notation used by a small group of non-statisticians (mainly physicists, I think).
Well, he's right in the sense that it is erf/erfc that are commonly documented in collections of special functions (like Abramowitz & Stegun), and also those that appear in common C math libraries. In both cases of course because physicists have dominated in their development. 

Of course "most people" use Excel these days (which has the inverse normal distribution but AFAICS not the inverse error function).
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20121018/f96f63c0/attachment.pl>
Thanks a lot. It's very helpful.
I've read through the c code. Just FYI and for my completion of the
question, I post some of my thought on it:
To me it looks like the algorithm is actually inquiring through an
approximation table (the approximations, at least for pnom, is "derived
from those in "Rational Chebyshev approximations for the error function" by
W. J. Cody, Math. Comp., 1969, 631-637."), it is not using any function
such as intergration or inverse erf to compute the value based on an
equation as I thought, though lack a bit subtlety but the up side is the
speed.
It's not an approximation table, it's a rational Chebyshev
approximation, a ratio of polynomials.  qnorm also uses a rational
polynomial approximation.

At some level this *has to be* what is going on: computers don't
implement pnorm/qnorm or erfc in hardware, and there is no closed-form
expression for them in terms of quantities that are implemented in
hardware, so the functions must be some sort of approximate expression
using arithmetic and other hardware computations.
From the point of view of R, the only relevant issues are precision,
speed, and portability, and these approximations are portable,
accurate, and fast.

    -thomas
Hope this can help others who had similar questions as well.

Sheng

On Thu, Oct 18, 2012 at 2:32 AM, peter dalgaard <pdalgd at gmail.com> wrote:

On Oct 18, 2012, at 09:55 , Prof Brian Ripley wrote:

R is a bit confusing as it requires inverse error function (X =
- sqrt(2)* erf-1 (2*P)), while R doesn't have a build in one. The InvErf
function most people use is through qnorm( InvErf=function(x)
I think you are wrong about 'most people': this is the notation used by
a small group of non-statisticians (mainly physicists, I think).

Well, he's right in the sense that it is erf/erfc that are commonly
documented in collections of special functions (like Abramowitz & Stegun),
and also those that appear in common C math libraries. In both cases of
course because physicists have dominated in their development.

Of course "most people" use Excel these days (which has the inverse normal
distribution but AFAICS not the inverse error function).

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

        [[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Thomas Lumley
Professor of Biostatistics
University of Auckland

On Fri, Oct 19, 2012 at 12:21 PM, Sheng Liu <sheng.liu at live.ca> wrote:
Thanks a lot. It's very helpful.
I've read through the c code. Just FYI and for my completion of the
question, I post some of my thought on it:
To me it looks like the algorithm is actually inquiring through an
approximation table (the approximations, at least for pnom, is "derived
from those in "Rational Chebyshev approximations for the error function" by
W. J. Cody, Math. Comp., 1969, 631-637."), it is not using any function
such as intergration or inverse erf to compute the value based on an
equation as I thought, though lack a bit subtlety but the up side is the
speed.
It's not an approximation table, it's a rational Chebyshev
approximation, a ratio of polynomials.  qnorm also uses a rational
polynomial approximation.
I inferred that Sheng probably knew that and meant that there's a table of coefficients of a rational polynomial approximation.
At some level this *has to be* what is going on: computers don't
implement pnorm/qnorm or erfc in hardware, and there is no closed-form
expression for them in terms of quantities that are implemented in
hardware, so the functions must be some sort of approximate expression
using arithmetic and other hardware computations.
Up to a few years ago, or maybe a decade, that was also the way special functions and even square roots and division were implemented in hardware using lookup tables and add/multiply circuits. All hell broke loose when one of the coefficients got entered incorrectly -- some may remember the FDIV Pentium bug of 1994. I used to have a really nice paper from Byte magazine c. 1990 which detailed the process, but I think it got lost in space. (L. Brett Glass: "Math Coprocessors" could be the one. http://dl.acm.org/citation.cfm?id=86680). 

Nowadays, we have single CPU cycle transcendentals, so I suppose that has all been reduced to hard-core optimized electronic circuitry. Fringe-market customers like statisticians still need to implement their functions in software, of course.
From the point of view of R, the only relevant issues are precision,
speed, and portability, and these approximations are portable,
accurate, and fast.

   -thomas

Hope this can help others who had similar questions as well.

Sheng

On Thu, Oct 18, 2012 at 2:32 AM, peter dalgaard <pdalgd at gmail.com> wrote:

On Oct 18, 2012, at 09:55 , Prof Brian Ripley wrote:

R is a bit confusing as it requires inverse error function (X =
- sqrt(2)* erf-1 (2*P)), while R doesn't have a build in one. The InvErf
function most people use is through qnorm( InvErf=function(x)
I think you are wrong about 'most people': this is the notation used by
a small group of non-statisticians (mainly physicists, I think).

Well, he's right in the sense that it is erf/erfc that are commonly
documented in collections of special functions (like Abramowitz & Stegun),
and also those that appear in common C math libraries. In both cases of
course because physicists have dominated in their development.

Of course "most people" use Excel these days (which has the inverse normal
distribution but AFAICS not the inverse error function).

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

       [[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20121019/6fe40126/attachment.pl>