results of pnorm as either NaN or Inf
Thank you for your responses. I presented a minimal example of the issue, but I should have explained that this came up in the context of maximizing a log likelihood function (with optim). I certainly agree that there would be no good reason for a human to evaluate the function pnorm(-x, log.p=TRUE) for large x. However, I would suggest that one may want the function to return a value of -Inf for large x, rather than issue an error or a warning, as I'm guessing your alteration of the C code would do. Returning a value of -Inf would allow an optimization routine (or a pre-optimization-routine grid search) to know that it's going into (or is currently in) the wrong region. I can definitely see the counterargument that someone using an optimization routine should think about the function they are optimizing more carefully, especially given that there could be numerical issues. I think I'm not the right person to judge, both because I don't know the programming issues nor much about numerical optimization, but I wanted to offer a possible argument for returning -Inf in case it hasn't already been considered, and others can evaluate the issue. (In general, my earlier purpose was just to point out what I thought might be an inconsistency in case people wanted to change it. I'm not concerned about it for my own usage at all, since I can just adjust my own programs.) Thank you, and to all, thank you for all of your work on R. Eric On Fri, 14 May 2010 11:50:09 +0100 (BST), Prof Brian Ripley
<ripley at stats.ox.ac.uk> wrote:
The answer to pnorm(-x, log.p=TRUE) is of course about -0.5*x*x for large x. So trying to compute it for -1e108 is a bit silly, both on your part and the C code used. I've altered the C code not to attempt it for x > 1e170, which excludes the area where underflow occurs. On Fri, 14 May 2010, Ted.Harding at manchester.ac.uk wrote:
On 13-May-10 20:04:50, efreeman at berkeley.edu wrote:
I stumbled across this and I am wondering if this is unexpected behavior or if I am missing something.
pnorm(-1.0e+307, log.p=TRUE)
[1] -Inf
pnorm(-1.0e+308, log.p=TRUE)
[1] NaN Warning message: In pnorm(q, mean, sd, lower.tail, log.p) : NaNs produced
pnorm(-1.0e+309, log.p=TRUE)
[1] -Inf I don't know C and am not that skilled with R, so it would be hard for me to look into the code for pnorm. If I'm not just missing something, I thought it may be of interest. Details: I am using Mac OS X 10.5.8. I installed a precompiled binary version. Here is the output from sessionInfo(), requested in the posting guide: R version 2.11.0 (2010-04-22) i386-apple-darwin9.8.0 locale: [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base loaded via a namespace (and not attached): [1] tools_2.11.0 Thank you very much, Eric Freeman UC Berkeley
This is probably platform-independent. I get the same results with R on Linux. More to the point: You are clearly "pushing the envelope" here. First, have a look at what R makes of your inputs to pnorm(): -1.0e+307 # [1] -1e+307 -1.0e+308 # [1] -1e+308 -1.0e+309 # [1] -Inf So, somewhere beteen -1e+308 and -1.0e+309. the envelope burst! Given -1.0e+309, R returns -Inf (i.e. R can no longer represent this internally as a finite number). Now look at pnorm(-Inf,log.p=TRUE) # [1] -Inf So, R knows how to give the correct answer (an exact 0, or -Inf on the log scale) if you feed pnorm() with -Inf. So you're OK with -1e+N where N >= 309. For smaller powers, e.g. -1e+(200:306), these give pnorm() much less than -1.0e+309, and presumably R's algorithm (which I haven't studied either ... ) returns 0 for pnorm(), as it should to the available internal accuracy. So, up to pnorm(-1.0e+307, log.p=TRUE) = -Inf. All is as it should be. However, at -1e+308, "the envelope is about to burst", and something may occur within the algorithm which results in a NaN. So there is nothing anomalous about your results except at -1e+308, which is where R is at a critical point. That's how I see it, anway! Ted. -------------------------------------------------------------------- E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk> Fax-to-email: +44 (0)870 094 0861 Date: 14-May-10 Time: 00:01:27 ------------------------------ XFMail ------------------------------
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel