Skip to content

Calculation of e^{z^2/2} for a normal deviate z

15 messages · Rui Barradas, Christophe Dutang, Peter Dalgaard +6 more

#
Dear R-developers,

I am keen to calculate exp(z*z/2) with z=qnorm(p/2) and p is very small. I wonder if anyone has experience with this?

Thanks very much in advance,


Jing Hua
#
Hello,

Well, try it:

p <- .Machine$double.eps^seq(0.5, 1, by = 0.05)
z <- qnorm(p/2)

pnorm(z)
# [1] 7.450581e-09 1.228888e-09 2.026908e-10 3.343152e-11 5.514145e-12
# [6] 9.094947e-13 1.500107e-13 2.474254e-14 4.080996e-15 6.731134e-16
#[11] 1.110223e-16
p/2
# [1] 7.450581e-09 1.228888e-09 2.026908e-10 3.343152e-11 5.514145e-12
# [6] 9.094947e-13 1.500107e-13 2.474254e-14 4.080996e-15 6.731134e-16
#[11] 1.110223e-16

exp(z*z/2)
# [1] 9.184907e+06 5.301421e+07 3.073154e+08 1.787931e+09 1.043417e+10
# [6] 6.105491e+10 3.580873e+11 2.104460e+12 1.239008e+13 7.306423e+13
#[11] 4.314798e+14


p is the smallest possible such that 1 + p != 1 and I couldn't find 
anything to worry about.


R version 3.6.0 (2019-04-26)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 19.04

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.8.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.8.0

locale:
  [1] LC_CTYPE=pt_PT.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=pt_PT.UTF-8        LC_COLLATE=pt_PT.UTF-8
  [5] LC_MONETARY=pt_PT.UTF-8    LC_MESSAGES=pt_PT.UTF-8
  [7] LC_PAPER=pt_PT.UTF-8       LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=pt_PT.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods
[7] base

other attached packages:

[many packages loaded]


Hope this helps,

Rui Barradas

?s 15:24 de 21/06/19, jing hua zhao escreveu:
#
Dear Rui,

Thanks for your quick reply -- this allows me to see the bottom of this. I was hoping we could have a handle of those p in genmoics such as 1e-300 or smaller.

Best wishes,


Jing Hua
#
Hello,

Sorry, my mistake, I grossly misunderstood the question.

qnorm(1e-300)
#[1] -37.0471

Anyway, you cannot go much smaller.

p <- 10^seq(-300, -400, by = -10)
z <- qnorm(p/2)
exp(z*z/2)


Hope this helps,

Rui Barradas

?s 16:11 de 21/06/19, jing hua zhao escreveu:
#
You should take a look at https://CRAN.R-project.org/package=Rmpfr

Regards, Christophe 

------------------------------------------------------------
Christophe Dutang
CEREMADE, Univ. Paris Dauphine, PSL Univ., France
Web : http://dutangc.free.fr
------------------------------------------------------------
#
You may want to look into using the log option to qnorm

e.g., in round figures:
[1] -690.7755
[1] -37.05315
[1] 1.881797e+297
[1] 5.314068e-298

Notice that floating point representation cuts out at 1e+/-308 or so. If you want to go outside that range, you may need explicit manipulation of the log values. qnorm() itself seems quite happy with much smaller values:
[1] -99.94475

-pd

  
    
#
Hi Jing,

Peter pointed out how you can, more or less, get numbers for this, and he's
absolutely right. At the risk of giving unsolicited advice, though, Im
don't think you *should* in this case.

Someone on this list with more applied Statistics or Statistical Genetics
experience can correct me if I'm wrong, but I have always been under the
impression that "pvalues" that small  (and honestly ones that are not
nearly that small) are not meaningful as actual numbers and should never be
used as such. Essentially, AFAIK, a p-value of 1e-300 just means "really
small"/"really unlikely", the same as a pvalue of 1e-280 does (even though
the second pvalue is *20 orders of magnitude* larger). Like, how
*precicely* correct
would the distributional assumptions the test is making have to be for that
-300 to be meaningful?

Note that if you have putative pvalues in the 1e-300 range, a full order of
magnitude increase in the pvalue (to 1e-299) only corresponds to only a
reduction of 0.06 in the quantile value. Is that amount of measurement
difference likely to be meaningful in a genomics setting or is it probably
machine noise? (my impression is the later).
[1] -690.7755
[1] -688.4729
[1] -37.05315
[1] -36.97216
[1] 0.06216021

 And all of the above ignores  the most important thing in practice, though
(sorry for burying the lede), which is the fact that the precision
tolerance of the computation calculating the pvalue is very likely *hundreds
of orders of magnitude *larger (ie less precise) than 1e-300 unless it uses
infinite precision arithmetic (which, I think, would be a very weird thing
to do for the reasons pointed out above).

Personally, whenever I See values on the order of 1e-K where K is bigger
than like 20 on the ouside , I just see rounding errors around 0, not
numbers that are, e.g., meaningfully comparable or sortable or summable,
etc.

So R will give you numbers for this, as Peter showed you, but personally
remain pretty skeptical that they will actually be useful for what you want
to do with them i this case.

Best,
~G
On Fri, Jun 21, 2019 at 9:32 AM peter dalgaard <pdalgd at gmail.com> wrote:

            

  
  
#
Hi Peter, Rui, Chrstophe and Gabriel,

Thanks for your inputs --  the use of qnorm(., log=TRUE) is a good point in line with pnorm with which we devised log(p)  as

log(2) + pnorm(-abs(z), lower.tail = TRUE, log.p = TRUE)

that could do really really well for large z compared to Rmpfr. Maybe I am asking too much since

z <-20000
[1] "1.660579603192917090365313727164e-86858901"

already gives a rarely seen small p value. I gather I also need a multiple precision exp() and their sum since exp(z^2/2) is also a Bayes Factor so I  get log(x_i )/sum_i log(x_i) instead. To this point, I am obliged to clarify, see https://statgen.github.io/gwas-credible-sets/method/locuszoom-credible-sets.pdf.

I agree many feel geneticists go to far with small p values which I would have difficulty to argue againston the other hand it is also expected to see these in a non-genetic context. For instance the Framingham study was established in 1948 just got $34m for six years on phenotypewide association which we would be interesting to see.

Best wishes,


Jing Hua
1 day later
#
I agree with many the sentiments about the wisdom of computing very
small p-values (although the example below may win some kind of a prize:
I've seen people talking about p-values of the order of 10^(-2000), but
never 10^(-(10^8)) !).  That said, there are a several tricks for
getting more reasonable sums of very small probabilities.  The first is
to scale the p-values by dividing the *largest* of the probabilities,
then do the (p/sum(p)) computation, then multiply the result (I'm sure
this is described/documented somewhere).  More generally, there are
methods for computing sums on the log scale, e.g.

https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.misc.logsumexp.html

 I don't know where this has been implemented in the R ecosystem, but
this sort of computation is the basis of the "Brobdingnag" package for
operating on very large ("Brobdingnagian") and very small
("Lilliputian") numbers.
On 2019-06-21 6:58 p.m., jing hua zhao wrote:
#
include/Rmath.h declares a set of 'logspace' functions for use at the C
level.  I don't think there are core R functions that call them.

/* Compute the log of a sum or difference from logs of terms, i.e.,
 *
 *     log (exp (logx) + exp (logy))
 * or  log (exp (logx) - exp (logy))
 *
 * without causing overflows or throwing away too much accuracy:
 */
double  Rf_logspace_add(double logx, double logy);
double  Rf_logspace_sub(double logx, double logy);
double  Rf_logspace_sum(const double *logx, int nx);

Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Sun, Jun 23, 2019 at 1:40 AM Ben Bolker <bbolker at gmail.com> wrote:

            

  
  
#
On 22/06/2019 00:58, jing hua zhao wrote:
Another approach could be simply to note that a function defined as 
f(p)=exp(-z(p)^2/2) is regular around p=0 with f(0)=0.
It has roughly the shape of p*(2-p) for p \in [0; 1]. So we can 
calculate let say f(10^-10) with sufficient precision using Rmpfr and 
then use a linear approximation for p from [0, 10^-10]. After that a 
simple inverse gives us e^(z*z/2).

Serguei.
#
> include/Rmath.h declares a set of 'logspace' functions for use at the C
    > level.  I don't think there are core R functions that call them.

    > /* Compute the log of a sum or difference from logs of terms, i.e.,
    > *
    > *     log (exp (logx) + exp (logy))
    > * or  log (exp (logx) - exp (logy))
    > *
    > * without causing overflows or throwing away too much accuracy:
    > */
    > double  Rf_logspace_add(double logx, double logy);
    > double  Rf_logspace_sub(double logx, double logy);
    > double  Rf_logspace_sum(const double *logx, int nx);

    > Bill Dunlap
    > TIBCO Software
    > wdunlap tibco.com

Yes, indeed, thank you, Bill!

But they *have* been in use by core R functions for a long time
in pgamma, pbeta and related functions.

[and I have had changes in *hyper.c where logspace_add() is used too, 
 for several years now (since 2015) but I no longer know which
 concrete accuracy problem that addresses, so have not yet committed it]


Martin Maechler
ETH Zurich and R Core Team
> On Sun, Jun 23, 2019 at 1:40 AM Ben Bolker <bbolker at gmail.com> wrote:
>> 
    >> I agree with many the sentiments about the wisdom of computing very
    >> small p-values (although the example below may win some kind of a prize:
    >> I've seen people talking about p-values of the order of 10^(-2000), but
    >> never 10^(-(10^8)) !).  That said, there are a several tricks for
    >> getting more reasonable sums of very small probabilities.  The first is
    >> to scale the p-values by dividing the *largest* of the probabilities,
    >> then do the (p/sum(p)) computation, then multiply the result (I'm sure
    >> this is described/documented somewhere).  More generally, there are
    >> methods for computing sums on the log scale, e.g.
    >> 
    >> 
    >> https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.misc.logsumexp.html
    >> 
    >> I don't know where this has been implemented in the R ecosystem, but
    >> this sort of computation is the basis of the "Brobdingnag" package for
    >> operating on very large ("Brobdingnagian") and very small
    >> ("Lilliputian") numbers.
    >> 
    >>
>> On 2019-06-21 6:58 p.m., jing hua zhao wrote:
>> > Hi Peter, Rui, Chrstophe and Gabriel,
    >> >
    >> > Thanks for your inputs --  the use of qnorm(., log=TRUE) is a good point
    >> in line with pnorm with which we devised log(p)  as
    >> >
    >> > log(2) + pnorm(-abs(z), lower.tail = TRUE, log.p = TRUE)
    >> >
    >> > that could do really really well for large z compared to Rmpfr. Maybe I
    >> am asking too much since
    >> >
    >> > z <-20000
    >> >> Rmpfr::format(2*pnorm(mpfr(-abs(z),100),lower.tail=TRUE,log.p=FALSE))
    >> > [1] "1.660579603192917090365313727164e-86858901"
    >> >
    >> > already gives a rarely seen small p value. I gather I also need a
    >> multiple precision exp() and their sum since exp(z^2/2) is also a Bayes
    >> Factor so I  get log(x_i )/sum_i log(x_i) instead. To this point, I am
    >> obliged to clarify, see
    >> https://statgen.github.io/gwas-credible-sets/method/locuszoom-credible-sets.pdf
    >> .
    >> >
    >> > I agree many feel geneticists go to far with small p values which I
    >> would have difficulty to argue againston the other hand it is also expected
    >> to see these in a non-genetic context. For instance the Framingham study
    >> was established in 1948 just got $34m for six years on phenotypewide
    >> association which we would be interesting to see.
    >> >
    >> > Best wishes,
    >> >
    >> >
    >> > Jing Hua
    >> >
    >> >
    >> > ________________________________
    >> > From: peter dalgaard <pdalgd at gmail.com>
    >> > Sent: 21 June 2019 16:24
    >> > To: jing hua zhao
    >> > Cc: Rui Barradas; r-devel at r-project.org
    >> > Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z
    >> >
    >> > You may want to look into using the log option to qnorm
    >> >
    >> > e.g., in round figures:
    >> >
    >> >> log(1e-300)
    >> > [1] -690.7755
    >> >> qnorm(-691, log=TRUE)
    >> > [1] -37.05315
    >> >> exp(37^2/2)
    >> > [1] 1.881797e+297
    >> >> exp(-37^2/2)
    >> > [1] 5.314068e-298
    >> >
    >> > Notice that floating point representation cuts out at 1e+/-308 or so. If
    >> you want to go outside that range, you may need explicit manipulation of
    >> the log values. qnorm() itself seems quite happy with much smaller values:
    >> >
    >> >> qnorm(-5000, log=TRUE)
    >> > [1] -99.94475
    >> >
    >> > -pd
    >> >
    >> >> On 21 Jun 2019, at 17:11 , jing hua zhao <jinghuazhao at hotmail.com>
>> wrote:
>> >>
    >> >> Dear Rui,
    >> >>
    >> >> Thanks for your quick reply -- this allows me to see the bottom of
    >> this. I was hoping we could have a handle of those p in genmoics such as
    >> 1e-300 or smaller.
    >> >>
    >> >> Best wishes,
    >> >>
    >> >>
    >> >> Jing Hua
    >> >>
    >> >> ________________________________
    >> >> From: Rui Barradas <ruipbarradas at sapo.pt>
    >> >> Sent: 21 June 2019 15:03
    >> >> To: jing hua zhao; r-devel at r-project.org
    >> >> Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z
    >> >>
    >> >> Hello,
    >> >>
    >> >> Well, try it:
    >> >>
    >> >> p <- .Machine$double.eps^seq(0.5, 1, by = 0.05)
    >> >> z <- qnorm(p/2)
    >> >>
    >> >> pnorm(z)
    >> >> # [1] 7.450581e-09 1.228888e-09 2.026908e-10 3.343152e-11 5.514145e-12
    >> >> # [6] 9.094947e-13 1.500107e-13 2.474254e-14 4.080996e-15 6.731134e-16
    >> >> #[11] 1.110223e-16
    >> >> p/2
    >> >> # [1] 7.450581e-09 1.228888e-09 2.026908e-10 3.343152e-11 5.514145e-12
    >> >> # [6] 9.094947e-13 1.500107e-13 2.474254e-14 4.080996e-15 6.731134e-16
    >> >> #[11] 1.110223e-16
    >> >>
    >> >> exp(z*z/2)
    >> >> # [1] 9.184907e+06 5.301421e+07 3.073154e+08 1.787931e+09 1.043417e+10
    >> >> # [6] 6.105491e+10 3.580873e+11 2.104460e+12 1.239008e+13 7.306423e+13
    >> >> #[11] 4.314798e+14
    >> >>
    >> >>
    >> >> p is the smallest possible such that 1 + p != 1 and I couldn't find
    >> >> anything to worry about.
    >> >>
    >> >>
    >> >> R version 3.6.0 (2019-04-26)
    >> >> Platform: x86_64-pc-linux-gnu (64-bit)
    >> >> Running under: Ubuntu 19.04
    >> >>
    >> >> Matrix products: default
    >> >> BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.8.0
    >> >> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.8.0
    >> >>
    >> >> locale:
    >> >>  [1] LC_CTYPE=pt_PT.UTF-8       LC_NUMERIC=C
    >> >>  [3] LC_TIME=pt_PT.UTF-8        LC_COLLATE=pt_PT.UTF-8
    >> >>  [5] LC_MONETARY=pt_PT.UTF-8    LC_MESSAGES=pt_PT.UTF-8
    >> >>  [7] LC_PAPER=pt_PT.UTF-8       LC_NAME=C
    >> >>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
    >> >> [11] LC_MEASUREMENT=pt_PT.UTF-8 LC_IDENTIFICATION=C
    >> >>
    >> >> attached base packages:
    >> >> [1] stats     graphics  grDevices utils     datasets  methods
    >> >> [7] base
    >> >>
    >> >> other attached packages:
    >> >>
    >> >> [many packages loaded]
    >> >>
    >> >>
    >> >> Hope this helps,
    >> >>
    >> >> Rui Barradas
    >> >>
    >> >> ?s 15:24 de 21/06/19, jing hua zhao escreveu:
    >> >>> Dear R-developers,
    >> >>>
    >> >>> I am keen to calculate exp(z*z/2) with z=qnorm(p/2) and p is very
    >> small. I wonder if anyone has experience with this?
    >> >>>
    >> >>> Thanks very much in advance,
    >> >>>
    >> >>>
    >> >>> Jing Hua
    >> >>>
    >> >>>       [[alternative HTML version deleted]]
    >> >>>
    >> >>> ______________________________________________
    >> >>> R-devel at r-project.org mailing list
    >> >>> https://stat.ethz.ch/mailman/listinfo/r-devel
    >> >>>
    >> >>
    >> >>        [[alternative HTML version deleted]]
    >> >>
    >> >> ______________________________________________
    >> >> R-devel at r-project.org mailing list
    >> >> https://stat.ethz.ch/mailman/listinfo/r-devel
    >> >
    >> > --
    >> > Peter Dalgaard, Professor,
    >> > Center for Statistics, Copenhagen Business School
    >> > Solbjerg Plads 3, 2000 Frederiksberg, Denmark
    >> > Phone: (+45)38153501
    >> > Office: A 4.23
    >> > Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
    >> >
    >> >
    >> >
    >> >
    >> >
    >> >
    >> >
    >> >
    >> >
    >> >
    >> >       [[alternative HTML version deleted]]
    >> >
    >> > ______________________________________________
    >> > R-devel at r-project.org mailing list
    >> > https://stat.ethz.ch/mailman/listinfo/r-devel
    >> >
    >> 
    >> ______________________________________________
    >> R-devel at r-project.org mailing list
    >> https://stat.ethz.ch/mailman/listinfo/r-devel
    >> 

    > [[alternative HTML version deleted]]

    > ______________________________________________
    > R-devel at r-project.org mailing list
    > https://stat.ethz.ch/mailman/listinfo/r-devel
#
Hi All,

Thanks for all your comments which allows me to appreciate more of these in Python and R.

I just came across the matrixStats package,


## EXAMPLE #1
lx <- c(1000.01, 1000.02)
y0 <- log(sum(exp(lx)))
print(y0) ## Inf

y1 <- logSumExp(lx)
print(y1) ## 1000.708

and
[1] 100001000 100002000
[1] 100002000
function (lx, idxs = NULL, na.rm = FALSE, ...)
{
    has_na <- TRUE
    .Call(C_logSumExp, as.numeric(lx), idxs, as.logical(na.rm),
        has_na)
}
<bytecode: 0x20c07a8>
<environment: namespace:matrixStats>

Maybe this is rather close?

Best wishes,


Jing Hua
#
> Hi All,
    > Thanks for all your comments which allows me to appreciate more of these in Python and R.

    > I just came across the matrixStats package,


    > ## EXAMPLE #1
    > lx <- c(1000.01, 1000.02)
    > y0 <- log(sum(exp(lx)))
    > print(y0) ## Inf

    > y1 <- logSumExp(lx)
    > print(y1) ## 1000.708

    > and

    >> ly <- lx*100000
    >> ly
    > [1] 100001000 100002000
    >> y1 <- logSumExp(ly)
    >> print(y1)
    > [1] 100002000
    >> logSumExp
    > function (lx, idxs = NULL, na.rm = FALSE, ...)
    > {
    > has_na <- TRUE
    > .Call(C_logSumExp, as.numeric(lx), idxs, as.logical(na.rm),
    > has_na)
    > }
    > <bytecode: 0x20c07a8>
    > <environment: namespace:matrixStats>

    > Maybe this is rather close?

Thank you  Jing Hua,

indeed the issue of sums of (very large or very small)
exponentials is a special case, that can well be treated
specially
 - as it is not so infrequent
 - via "obvious" simplifications can be implemented even more accurately

We (authors of the R package 'copula') have had a need for
these as well, in likelihood computation for Archimedean
copulas, and
have efficient R level implementations, both for your case  and
the -- even more delicate -- case of e.g., alternating signs of
exponential terms.

"Unfortunately", we had never exported the functions from the
package, so you'd need

     copula:::lsum()   # log          sum {of exponentials in log scale}
or
     copula:::lssum()  # log *s*igned sum {of exponentials in log scale}

for the 2nd case.

The advantage is it's simple R code implemented quite
efficiently for the vector and matrices cases,

Source code from source file copula/R/special-func.R

 (in svn at R-forge :
   -->
   https://r-forge.r-project.org/scm/viewvc.php/pkg/copula/R/special-func.R?view=markup&root=copula )

{Yes, this GPL-2 licenced  {with Copyright, see file, please
      	   	 	    keep this one line}
## Copyright (C) 2012 Marius Hofert, Ivan Kojadinovic, Martin Maechler, and Jun Y
}

----------------------------------------------------------------------

##' Properly compute log(x_1 + .. + x_n) for a given (n x d)-matrix of n row
##' vectors log(x_1),..,log(x_n) (each of dimension d)
##' Here, x_i > 0  for all i
##' @title Properly compute the logarithm of a sum
##' @param lx (n,d)-matrix containing the row vectors log(x_1),..,log(x_n)
##'        each of dimension d
##' @param l.off the offset to subtract and re-add; ideally in the order of
##'        the maximum of each column
##' @return log(x_1 + .. + x_n) [i.e., OF DIMENSION d!!!] computed via
##'         log(sum(x)) = log(sum(exp(log(x))))
##'         = log(exp(log(x_max))*sum(exp(log(x)-log(x_max))))
##'         = log(x_max) + log(sum(exp(log(x)-log(x_max)))))
##'         = lx.max + log(sum(exp(lx-lx.max)))
##'         => VECTOR OF DIMENSION d
##' @author Marius Hofert, Martin Maechler
lsum <- function(lx, l.off) {
    rx <- length(d <- dim(lx))
    if(mis.off <- missing(l.off)) l.off <- {
	if(rx <= 1L)
	    max(lx)
	else if(rx == 2L)
	    apply(lx, 2L, max)
    }
    if(rx <= 1L) { ## vector
	if(is.finite(l.off))
	    l.off + log(sum(exp(lx - l.off)))
	else if(mis.off || is.na(l.off) || l.off == max(lx))
	    l.off # NA || NaN or all lx == -Inf, or max(.) == Inf
	else
	    stop("'l.off  is infinite but not == max(.)")
    } else if(rx == 2L) { ## matrix
	if(any(x.off <- !is.finite(l.off))) {
	    if(mis.off || isTRUE(all.equal(l.off, apply(lx, 2L, max)))) {
		## we know l.off = colMax(.)
		if(all(x.off)) return(l.off)
		r <- l.off
		iok <- which(!x.off)
		l.of <- l.off[iok]
		r[iok] <- l.of + log(colSums(exp(lx[,iok,drop=FALSE] -
						     rep(l.of, each=d[1]))))
		r
	    } else ## explicitly specified l.off differing from colMax(.)
		stop("'l.off' has non-finite values but differs from default max(.)")
	}
	else
	    l.off + log(colSums(exp(lx - rep(l.off, each=d[1]))))
    } else stop("not yet implemented for arrays of rank >= 3")
}

##' Properly compute log(x_1 + .. + x_n) for a given matrix of column vectors
##' log(|x_1|),.., log(|x_n|) and corresponding signs sign(x_1),.., sign(x_n)
##' Here, x_i is of arbitrary sign
##' @title compute logarithm of a sum with signed large coefficients
##' @param lxabs (d,n)-matrix containing the column vectors log(|x_1|),..,log(|x_n|)
##'        each of dimension d
##' @param signs corresponding matrix of signs sign(x_1), .., sign(x_n)
##' @param l.off the offset to subtract and re-add; ideally in the order of max(.)
##' @param strict logical indicating if it should stop on some negative sums
##' @return log(x_1 + .. + x_n) [i.e., of dimension d] computed via
##'         log(sum(x)) = log(sum(sign(x)*|x|)) = log(sum(sign(x)*exp(log(|x|))))
##'         = log(exp(log(x0))*sum(signs*exp(log(|x|)-log(x0))))
##'         = log(x0) + log(sum(signs* exp(log(|x|)-log(x0))))
##'         = l.off   + log(sum(signs* exp(lxabs -  l.off  )))
##' @author Marius Hofert and Martin Maechler
lssum <- function (lxabs, signs, l.off = apply(lxabs, 2, max), strict = TRUE) {
    stopifnot(length(dim(lxabs)) == 2L) # is.matrix(.) generalized
    sum. <- colSums(signs * exp(lxabs - rep(l.off, each=nrow(lxabs))))
    if(anyNA(sum.) || any(sum. <= 0))
        (if(strict) stop else warning)("lssum found non-positive sums")
    l.off + log(sum.)
}

----------------------------------------------------------------------

    > Best wishes,
    > Jing Hua

    > ________________________________
    > From: R-devel <r-devel-bounces at r-project.org> on behalf of Martin Maechler <maechler at stat.math.ethz.ch>
    > Sent: 24 June 2019 08:37
    > To: William Dunlap
    > Cc: r-devel at r-project.org
    > Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z
>> include/Rmath.h declares a set of 'logspace' functions for use at the C
    >> level.  I don't think there are core R functions that call them.

    >> /* Compute the log of a sum or difference from logs of terms, i.e.,
    >> *
    >> *     log (exp (logx) + exp (logy))
    >> * or  log (exp (logx) - exp (logy))
    >> *
    >> * without causing overflows or throwing away too much accuracy:
    >> */
    >> double  Rf_logspace_add(double logx, double logy);
    >> double  Rf_logspace_sub(double logx, double logy);
    >> double  Rf_logspace_sum(const double *logx, int nx);

    >> Bill Dunlap
    >> TIBCO Software
    >> wdunlap tibco.com

    > Yes, indeed, thank you, Bill!

    > But they *have* been in use by core R functions for a long time
    > in pgamma, pbeta and related functions.

    > [and I have had changes in *hyper.c where logspace_add() is used too,
    > for several years now (since 2015) but I no longer know which
    > concrete accuracy problem that addresses, so have not yet committed it]


    > Martin Maechler
    > ETH Zurich and R Core Team
#
Hi Martin,

Thanks for another tips -- it will be hard for me to comprehend fully your implementation of Rmpfr. I will look at these instead. I realised what I thought was a difficult problem turned to be a popular one. I managed to get a Python version here for information.

# http://bayesjumping.net/log-sum-exp-trick/
import numpy as np
def logSumExp(ns):
    max = np.max(ns)
    ds = ns - max
    sumOfExp = np.exp(ds).sum()
    return max + np.log(sumOfExp)

x = [100001000, 100002000]
logSumExp(x)

from scipy.misc import logsumexp
logsumexp(x)

import numpy as np;
prob1 = np.log(1e-50)
prob2 = np.log(2.5e-50)
prob12 = np.logaddexp(prob1, prob2)
prob12
np.exp(prob12)

As expected they are in good agreement with R.

Best regards,


Jing Hua