Dear R-developers, I am keen to calculate exp(z*z/2) with z=qnorm(p/2) and p is very small. I wonder if anyone has experience with this? Thanks very much in advance, Jing Hua
Calculation of e^{z^2/2} for a normal deviate z
15 messages · Rui Barradas, Christophe Dutang, Peter Dalgaard +6 more
Hello, Well, try it: p <- .Machine$double.eps^seq(0.5, 1, by = 0.05) z <- qnorm(p/2) pnorm(z) # [1] 7.450581e-09 1.228888e-09 2.026908e-10 3.343152e-11 5.514145e-12 # [6] 9.094947e-13 1.500107e-13 2.474254e-14 4.080996e-15 6.731134e-16 #[11] 1.110223e-16 p/2 # [1] 7.450581e-09 1.228888e-09 2.026908e-10 3.343152e-11 5.514145e-12 # [6] 9.094947e-13 1.500107e-13 2.474254e-14 4.080996e-15 6.731134e-16 #[11] 1.110223e-16 exp(z*z/2) # [1] 9.184907e+06 5.301421e+07 3.073154e+08 1.787931e+09 1.043417e+10 # [6] 6.105491e+10 3.580873e+11 2.104460e+12 1.239008e+13 7.306423e+13 #[11] 4.314798e+14 p is the smallest possible such that 1 + p != 1 and I couldn't find anything to worry about. R version 3.6.0 (2019-04-26) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu 19.04 Matrix products: default BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.8.0 LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.8.0 locale: [1] LC_CTYPE=pt_PT.UTF-8 LC_NUMERIC=C [3] LC_TIME=pt_PT.UTF-8 LC_COLLATE=pt_PT.UTF-8 [5] LC_MONETARY=pt_PT.UTF-8 LC_MESSAGES=pt_PT.UTF-8 [7] LC_PAPER=pt_PT.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=pt_PT.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods [7] base other attached packages: [many packages loaded] Hope this helps, Rui Barradas ?s 15:24 de 21/06/19, jing hua zhao escreveu:
Dear R-developers, I am keen to calculate exp(z*z/2) with z=qnorm(p/2) and p is very small. I wonder if anyone has experience with this? Thanks very much in advance, Jing Hua [[alternative HTML version deleted]]
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Dear Rui, Thanks for your quick reply -- this allows me to see the bottom of this. I was hoping we could have a handle of those p in genmoics such as 1e-300 or smaller. Best wishes, Jing Hua
From: Rui Barradas <ruipbarradas at sapo.pt>
Sent: 21 June 2019 15:03
To: jing hua zhao; r-devel at r-project.org
Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z
Sent: 21 June 2019 15:03
To: jing hua zhao; r-devel at r-project.org
Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z
Hello, Well, try it: p <- .Machine$double.eps^seq(0.5, 1, by = 0.05) z <- qnorm(p/2) pnorm(z) # [1] 7.450581e-09 1.228888e-09 2.026908e-10 3.343152e-11 5.514145e-12 # [6] 9.094947e-13 1.500107e-13 2.474254e-14 4.080996e-15 6.731134e-16 #[11] 1.110223e-16 p/2 # [1] 7.450581e-09 1.228888e-09 2.026908e-10 3.343152e-11 5.514145e-12 # [6] 9.094947e-13 1.500107e-13 2.474254e-14 4.080996e-15 6.731134e-16 #[11] 1.110223e-16 exp(z*z/2) # [1] 9.184907e+06 5.301421e+07 3.073154e+08 1.787931e+09 1.043417e+10 # [6] 6.105491e+10 3.580873e+11 2.104460e+12 1.239008e+13 7.306423e+13 #[11] 4.314798e+14 p is the smallest possible such that 1 + p != 1 and I couldn't find anything to worry about. R version 3.6.0 (2019-04-26) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu 19.04 Matrix products: default BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.8.0 LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.8.0 locale: [1] LC_CTYPE=pt_PT.UTF-8 LC_NUMERIC=C [3] LC_TIME=pt_PT.UTF-8 LC_COLLATE=pt_PT.UTF-8 [5] LC_MONETARY=pt_PT.UTF-8 LC_MESSAGES=pt_PT.UTF-8 [7] LC_PAPER=pt_PT.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=pt_PT.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods [7] base other attached packages: [many packages loaded] Hope this helps, Rui Barradas ?s 15:24 de 21/06/19, jing hua zhao escreveu: > Dear R-developers, > > I am keen to calculate exp(z*z/2) with z=qnorm(p/2) and p is very small. I wonder if anyone has experience with this? > > Thanks very much in advance, > > > Jing Hua > > [[alternative HTML version deleted]] > > ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel >
Hello, Sorry, my mistake, I grossly misunderstood the question. qnorm(1e-300) #[1] -37.0471 Anyway, you cannot go much smaller. p <- 10^seq(-300, -400, by = -10) z <- qnorm(p/2) exp(z*z/2) Hope this helps, Rui Barradas ?s 16:11 de 21/06/19, jing hua zhao escreveu:
Dear Rui,
Thanks for your quick reply -- this allows me to see the bottom of this.
I was hoping we could have a handle of those p in genmoics such as
1e-300 or smaller.
Best wishes,
Jing Hua
------------------------------------------------------------------------
*From:* Rui Barradas <ruipbarradas at sapo.pt>
*Sent:* 21 June 2019 15:03
*To:* jing hua zhao; r-devel at r-project.org
*Subject:* Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z
Hello,
Well, try it:
p <- .Machine$double.eps^seq(0.5, 1, by = 0.05)
z <- qnorm(p/2)
pnorm(z)
# [1] 7.450581e-09 1.228888e-09 2.026908e-10 3.343152e-11 5.514145e-12
# [6] 9.094947e-13 1.500107e-13 2.474254e-14 4.080996e-15 6.731134e-16
#[11] 1.110223e-16
p/2
# [1] 7.450581e-09 1.228888e-09 2.026908e-10 3.343152e-11 5.514145e-12
# [6] 9.094947e-13 1.500107e-13 2.474254e-14 4.080996e-15 6.731134e-16
#[11] 1.110223e-16
exp(z*z/2)
# [1] 9.184907e+06 5.301421e+07 3.073154e+08 1.787931e+09 1.043417e+10
# [6] 6.105491e+10 3.580873e+11 2.104460e+12 1.239008e+13 7.306423e+13
#[11] 4.314798e+14
p is the smallest possible such that 1 + p != 1 and I couldn't find
anything to worry about.
R version 3.6.0 (2019-04-26)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 19.04
Matrix products: default
BLAS:?? /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.8.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.8.0
locale:
? [1] LC_CTYPE=pt_PT.UTF-8?????? LC_NUMERIC=C
? [3] LC_TIME=pt_PT.UTF-8??????? LC_COLLATE=pt_PT.UTF-8
? [5] LC_MONETARY=pt_PT.UTF-8??? LC_MESSAGES=pt_PT.UTF-8
? [7] LC_PAPER=pt_PT.UTF-8?????? LC_NAME=C
? [9] LC_ADDRESS=C?????????????? LC_TELEPHONE=C
[11] LC_MEASUREMENT=pt_PT.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats???? graphics? grDevices utils???? datasets? methods
[7] base
other attached packages:
[many packages loaded]
Hope this helps,
Rui Barradas
?s 15:24 de 21/06/19, jing hua zhao escreveu:
Dear R-developers, I am keen to calculate exp(z*z/2) with z=qnorm(p/2) and p is very small. I wonder if anyone has experience with this? Thanks very much in advance, Jing Hua ??????? [[alternative HTML version deleted]]
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
You should take a look at https://CRAN.R-project.org/package=Rmpfr Regards, Christophe ------------------------------------------------------------ Christophe Dutang CEREMADE, Univ. Paris Dauphine, PSL Univ., France Web : http://dutangc.free.fr ------------------------------------------------------------
Le 21 juin 2019 ? 17:11, jing hua zhao <jinghuazhao at hotmail.com> a ?crit : Dear Rui, Thanks for your quick reply -- this allows me to see the bottom of this. I was hoping we could have a handle of those p in genmoics such as 1e-300 or smaller. Best wishes, Jing Hua
________________________________
From: Rui Barradas <ruipbarradas at sapo.pt>
Sent: 21 June 2019 15:03
To: jing hua zhao; r-devel at r-project.org
Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z
Hello,
Well, try it:
p <- .Machine$double.eps^seq(0.5, 1, by = 0.05)
z <- qnorm(p/2)
pnorm(z)
# [1] 7.450581e-09 1.228888e-09 2.026908e-10 3.343152e-11 5.514145e-12
# [6] 9.094947e-13 1.500107e-13 2.474254e-14 4.080996e-15 6.731134e-16
#[11] 1.110223e-16
p/2
# [1] 7.450581e-09 1.228888e-09 2.026908e-10 3.343152e-11 5.514145e-12
# [6] 9.094947e-13 1.500107e-13 2.474254e-14 4.080996e-15 6.731134e-16
#[11] 1.110223e-16
exp(z*z/2)
# [1] 9.184907e+06 5.301421e+07 3.073154e+08 1.787931e+09 1.043417e+10
# [6] 6.105491e+10 3.580873e+11 2.104460e+12 1.239008e+13 7.306423e+13
#[11] 4.314798e+14
p is the smallest possible such that 1 + p != 1 and I couldn't find
anything to worry about.
R version 3.6.0 (2019-04-26)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 19.04
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.8.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.8.0
locale:
[1] LC_CTYPE=pt_PT.UTF-8 LC_NUMERIC=C
[3] LC_TIME=pt_PT.UTF-8 LC_COLLATE=pt_PT.UTF-8
[5] LC_MONETARY=pt_PT.UTF-8 LC_MESSAGES=pt_PT.UTF-8
[7] LC_PAPER=pt_PT.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=pt_PT.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base
other attached packages:
[many packages loaded]
Hope this helps,
Rui Barradas
?s 15:24 de 21/06/19, jing hua zhao escreveu:
Dear R-developers,
I am keen to calculate exp(z*z/2) with z=qnorm(p/2) and p is very small. I wonder if anyone has experience with this?
Thanks very much in advance,
Jing Hua
[[alternative HTML version deleted]]
______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
[[alternative HTML version deleted]]
______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
You may want to look into using the log option to qnorm e.g., in round figures:
log(1e-300)
[1] -690.7755
qnorm(-691, log=TRUE)
[1] -37.05315
exp(37^2/2)
[1] 1.881797e+297
exp(-37^2/2)
[1] 5.314068e-298 Notice that floating point representation cuts out at 1e+/-308 or so. If you want to go outside that range, you may need explicit manipulation of the log values. qnorm() itself seems quite happy with much smaller values:
qnorm(-5000, log=TRUE)
[1] -99.94475 -pd
On 21 Jun 2019, at 17:11 , jing hua zhao <jinghuazhao at hotmail.com> wrote: Dear Rui, Thanks for your quick reply -- this allows me to see the bottom of this. I was hoping we could have a handle of those p in genmoics such as 1e-300 or smaller. Best wishes, Jing Hua
________________________________
From: Rui Barradas <ruipbarradas at sapo.pt>
Sent: 21 June 2019 15:03
To: jing hua zhao; r-devel at r-project.org
Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z
Hello,
Well, try it:
p <- .Machine$double.eps^seq(0.5, 1, by = 0.05)
z <- qnorm(p/2)
pnorm(z)
# [1] 7.450581e-09 1.228888e-09 2.026908e-10 3.343152e-11 5.514145e-12
# [6] 9.094947e-13 1.500107e-13 2.474254e-14 4.080996e-15 6.731134e-16
#[11] 1.110223e-16
p/2
# [1] 7.450581e-09 1.228888e-09 2.026908e-10 3.343152e-11 5.514145e-12
# [6] 9.094947e-13 1.500107e-13 2.474254e-14 4.080996e-15 6.731134e-16
#[11] 1.110223e-16
exp(z*z/2)
# [1] 9.184907e+06 5.301421e+07 3.073154e+08 1.787931e+09 1.043417e+10
# [6] 6.105491e+10 3.580873e+11 2.104460e+12 1.239008e+13 7.306423e+13
#[11] 4.314798e+14
p is the smallest possible such that 1 + p != 1 and I couldn't find
anything to worry about.
R version 3.6.0 (2019-04-26)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 19.04
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.8.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.8.0
locale:
[1] LC_CTYPE=pt_PT.UTF-8 LC_NUMERIC=C
[3] LC_TIME=pt_PT.UTF-8 LC_COLLATE=pt_PT.UTF-8
[5] LC_MONETARY=pt_PT.UTF-8 LC_MESSAGES=pt_PT.UTF-8
[7] LC_PAPER=pt_PT.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=pt_PT.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base
other attached packages:
[many packages loaded]
Hope this helps,
Rui Barradas
?s 15:24 de 21/06/19, jing hua zhao escreveu:
Dear R-developers,
I am keen to calculate exp(z*z/2) with z=qnorm(p/2) and p is very small. I wonder if anyone has experience with this?
Thanks very much in advance,
Jing Hua
[[alternative HTML version deleted]]
______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
[[alternative HTML version deleted]]
______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Office: A 4.23 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
Hi Jing, Peter pointed out how you can, more or less, get numbers for this, and he's absolutely right. At the risk of giving unsolicited advice, though, Im don't think you *should* in this case. Someone on this list with more applied Statistics or Statistical Genetics experience can correct me if I'm wrong, but I have always been under the impression that "pvalues" that small (and honestly ones that are not nearly that small) are not meaningful as actual numbers and should never be used as such. Essentially, AFAIK, a p-value of 1e-300 just means "really small"/"really unlikely", the same as a pvalue of 1e-280 does (even though the second pvalue is *20 orders of magnitude* larger). Like, how *precicely* correct would the distributional assumptions the test is making have to be for that -300 to be meaningful? Note that if you have putative pvalues in the 1e-300 range, a full order of magnitude increase in the pvalue (to 1e-299) only corresponds to only a reduction of 0.06 in the quantile value. Is that amount of measurement difference likely to be meaningful in a genomics setting or is it probably machine noise? (my impression is the later).
log(1e-300)
[1] -690.7755
log(1e-299)
[1] -688.4729
qnorm(-691, log = TRUE)
[1] -37.05315
qnorm(-688, log=TRUE)
[1] -36.97216
qnorm(-688.4729, log=TRUE) - qnorm(-690.7755, log = TRUE)
[1] 0.06216021 And all of the above ignores the most important thing in practice, though (sorry for burying the lede), which is the fact that the precision tolerance of the computation calculating the pvalue is very likely *hundreds of orders of magnitude *larger (ie less precise) than 1e-300 unless it uses infinite precision arithmetic (which, I think, would be a very weird thing to do for the reasons pointed out above). Personally, whenever I See values on the order of 1e-K where K is bigger than like 20 on the ouside , I just see rounding errors around 0, not numbers that are, e.g., meaningfully comparable or sortable or summable, etc. So R will give you numbers for this, as Peter showed you, but personally remain pretty skeptical that they will actually be useful for what you want to do with them i this case. Best, ~G
On Fri, Jun 21, 2019 at 9:32 AM peter dalgaard <pdalgd at gmail.com> wrote:
You may want to look into using the log option to qnorm e.g., in round figures:
log(1e-300)
[1] -690.7755
qnorm(-691, log=TRUE)
[1] -37.05315
exp(37^2/2)
[1] 1.881797e+297
exp(-37^2/2)
[1] 5.314068e-298 Notice that floating point representation cuts out at 1e+/-308 or so. If you want to go outside that range, you may need explicit manipulation of the log values. qnorm() itself seems quite happy with much smaller values:
qnorm(-5000, log=TRUE)
[1] -99.94475 -pd
On 21 Jun 2019, at 17:11 , jing hua zhao <jinghuazhao at hotmail.com>
wrote:
Dear Rui, Thanks for your quick reply -- this allows me to see the bottom of this.
I was hoping we could have a handle of those p in genmoics such as 1e-300 or smaller.
Best wishes, Jing Hua
________________________________
From: Rui Barradas <ruipbarradas at sapo.pt>
Sent: 21 June 2019 15:03
To: jing hua zhao; r-devel at r-project.org
Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z
Hello,
Well, try it:
p <- .Machine$double.eps^seq(0.5, 1, by = 0.05)
z <- qnorm(p/2)
pnorm(z)
# [1] 7.450581e-09 1.228888e-09 2.026908e-10 3.343152e-11 5.514145e-12
# [6] 9.094947e-13 1.500107e-13 2.474254e-14 4.080996e-15 6.731134e-16
#[11] 1.110223e-16
p/2
# [1] 7.450581e-09 1.228888e-09 2.026908e-10 3.343152e-11 5.514145e-12
# [6] 9.094947e-13 1.500107e-13 2.474254e-14 4.080996e-15 6.731134e-16
#[11] 1.110223e-16
exp(z*z/2)
# [1] 9.184907e+06 5.301421e+07 3.073154e+08 1.787931e+09 1.043417e+10
# [6] 6.105491e+10 3.580873e+11 2.104460e+12 1.239008e+13 7.306423e+13
#[11] 4.314798e+14
p is the smallest possible such that 1 + p != 1 and I couldn't find
anything to worry about.
R version 3.6.0 (2019-04-26)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 19.04
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.8.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.8.0
locale:
[1] LC_CTYPE=pt_PT.UTF-8 LC_NUMERIC=C
[3] LC_TIME=pt_PT.UTF-8 LC_COLLATE=pt_PT.UTF-8
[5] LC_MONETARY=pt_PT.UTF-8 LC_MESSAGES=pt_PT.UTF-8
[7] LC_PAPER=pt_PT.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=pt_PT.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base
other attached packages:
[many packages loaded]
Hope this helps,
Rui Barradas
?s 15:24 de 21/06/19, jing hua zhao escreveu:
Dear R-developers,
I am keen to calculate exp(z*z/2) with z=qnorm(p/2) and p is very
small. I wonder if anyone has experience with this?
Thanks very much in advance,
Jing Hua
[[alternative HTML version deleted]]
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[[alternative HTML version deleted]]
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
-- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Office: A 4.23 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Hi Peter, Rui, Chrstophe and Gabriel, Thanks for your inputs -- the use of qnorm(., log=TRUE) is a good point in line with pnorm with which we devised log(p) as log(2) + pnorm(-abs(z), lower.tail = TRUE, log.p = TRUE) that could do really really well for large z compared to Rmpfr. Maybe I am asking too much since z <-20000
Rmpfr::format(2*pnorm(mpfr(-abs(z),100),lower.tail=TRUE,log.p=FALSE))
[1] "1.660579603192917090365313727164e-86858901" already gives a rarely seen small p value. I gather I also need a multiple precision exp() and their sum since exp(z^2/2) is also a Bayes Factor so I get log(x_i )/sum_i log(x_i) instead. To this point, I am obliged to clarify, see https://statgen.github.io/gwas-credible-sets/method/locuszoom-credible-sets.pdf. I agree many feel geneticists go to far with small p values which I would have difficulty to argue againston the other hand it is also expected to see these in a non-genetic context. For instance the Framingham study was established in 1948 just got $34m for six years on phenotypewide association which we would be interesting to see. Best wishes, Jing Hua
From: peter dalgaard <pdalgd at gmail.com>
Sent: 21 June 2019 16:24
To: jing hua zhao
Cc: Rui Barradas; r-devel at r-project.org
Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z
Sent: 21 June 2019 16:24
To: jing hua zhao
Cc: Rui Barradas; r-devel at r-project.org
Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z
You may want to look into using the log option to qnorm
e.g., in round figures:
> log(1e-300)
[1] -690.7755
> qnorm(-691, log=TRUE)
[1] -37.05315
> exp(37^2/2)
[1] 1.881797e+297
> exp(-37^2/2)
[1] 5.314068e-298
Notice that floating point representation cuts out at 1e+/-308 or so. If you want to go outside that range, you may need explicit manipulation of the log values. qnorm() itself seems quite happy with much smaller values:
> qnorm(-5000, log=TRUE)
[1] -99.94475
-pd
> On 21 Jun 2019, at 17:11 , jing hua zhao <jinghuazhao at hotmail.com> wrote:
>
> Dear Rui,
>
> Thanks for your quick reply -- this allows me to see the bottom of this. I was hoping we could have a handle of those p in genmoics such as 1e-300 or smaller.
>
> Best wishes,
>
>
> Jing Hua
>
> ________________________________
> From: Rui Barradas <ruipbarradas at sapo.pt>
> Sent: 21 June 2019 15:03
> To: jing hua zhao; r-devel at r-project.org
> Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z
>
> Hello,
>
> Well, try it:
>
> p <- .Machine$double.eps^seq(0.5, 1, by = 0.05)
> z <- qnorm(p/2)
>
> pnorm(z)
> # [1] 7.450581e-09 1.228888e-09 2.026908e-10 3.343152e-11 5.514145e-12
> # [6] 9.094947e-13 1.500107e-13 2.474254e-14 4.080996e-15 6.731134e-16
> #[11] 1.110223e-16
> p/2
> # [1] 7.450581e-09 1.228888e-09 2.026908e-10 3.343152e-11 5.514145e-12
> # [6] 9.094947e-13 1.500107e-13 2.474254e-14 4.080996e-15 6.731134e-16
> #[11] 1.110223e-16
>
> exp(z*z/2)
> # [1] 9.184907e+06 5.301421e+07 3.073154e+08 1.787931e+09 1.043417e+10
> # [6] 6.105491e+10 3.580873e+11 2.104460e+12 1.239008e+13 7.306423e+13
> #[11] 4.314798e+14
>
>
> p is the smallest possible such that 1 + p != 1 and I couldn't find
> anything to worry about.
>
>
> R version 3.6.0 (2019-04-26)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Ubuntu 19.04
>
> Matrix products: default
> BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.8.0
> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.8.0
>
> locale:
> [1] LC_CTYPE=pt_PT.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=pt_PT.UTF-8 LC_COLLATE=pt_PT.UTF-8
> [5] LC_MONETARY=pt_PT.UTF-8 LC_MESSAGES=pt_PT.UTF-8
> [7] LC_PAPER=pt_PT.UTF-8 LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=pt_PT.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods
> [7] base
>
> other attached packages:
>
> [many packages loaded]
>
>
> Hope this helps,
>
> Rui Barradas
>
> ?s 15:24 de 21/06/19, jing hua zhao escreveu:
>> Dear R-developers,
>>
>> I am keen to calculate exp(z*z/2) with z=qnorm(p/2) and p is very small. I wonder if anyone has experience with this?
>>
>> Thanks very much in advance,
>>
>>
>> Jing Hua
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
1 day later
I agree with many the sentiments about the wisdom of computing very small p-values (although the example below may win some kind of a prize: I've seen people talking about p-values of the order of 10^(-2000), but never 10^(-(10^8)) !). That said, there are a several tricks for getting more reasonable sums of very small probabilities. The first is to scale the p-values by dividing the *largest* of the probabilities, then do the (p/sum(p)) computation, then multiply the result (I'm sure this is described/documented somewhere). More generally, there are methods for computing sums on the log scale, e.g. https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.misc.logsumexp.html I don't know where this has been implemented in the R ecosystem, but this sort of computation is the basis of the "Brobdingnag" package for operating on very large ("Brobdingnagian") and very small ("Lilliputian") numbers.
On 2019-06-21 6:58 p.m., jing hua zhao wrote:
Hi Peter, Rui, Chrstophe and Gabriel, Thanks for your inputs -- the use of qnorm(., log=TRUE) is a good point in line with pnorm with which we devised log(p) as log(2) + pnorm(-abs(z), lower.tail = TRUE, log.p = TRUE) that could do really really well for large z compared to Rmpfr. Maybe I am asking too much since z <-20000
Rmpfr::format(2*pnorm(mpfr(-abs(z),100),lower.tail=TRUE,log.p=FALSE))
[1] "1.660579603192917090365313727164e-86858901" already gives a rarely seen small p value. I gather I also need a multiple precision exp() and their sum since exp(z^2/2) is also a Bayes Factor so I get log(x_i )/sum_i log(x_i) instead. To this point, I am obliged to clarify, see https://statgen.github.io/gwas-credible-sets/method/locuszoom-credible-sets.pdf. I agree many feel geneticists go to far with small p values which I would have difficulty to argue againston the other hand it is also expected to see these in a non-genetic context. For instance the Framingham study was established in 1948 just got $34m for six years on phenotypewide association which we would be interesting to see. Best wishes, Jing Hua
________________________________
From: peter dalgaard <pdalgd at gmail.com>
Sent: 21 June 2019 16:24
To: jing hua zhao
Cc: Rui Barradas; r-devel at r-project.org
Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z
You may want to look into using the log option to qnorm
e.g., in round figures:
log(1e-300)
[1] -690.7755
qnorm(-691, log=TRUE)
[1] -37.05315
exp(37^2/2)
[1] 1.881797e+297
exp(-37^2/2)
[1] 5.314068e-298
Notice that floating point representation cuts out at 1e+/-308 or so. If you want to go outside that range, you may need explicit manipulation of the log values. qnorm() itself seems quite happy with much smaller values:
qnorm(-5000, log=TRUE)
[1] -99.94475
-pd
On 21 Jun 2019, at 17:11 , jing hua zhao <jinghuazhao at hotmail.com> wrote:
Dear Rui,
Thanks for your quick reply -- this allows me to see the bottom of this. I was hoping we could have a handle of those p in genmoics such as 1e-300 or smaller.
Best wishes,
Jing Hua
________________________________
From: Rui Barradas <ruipbarradas at sapo.pt>
Sent: 21 June 2019 15:03
To: jing hua zhao; r-devel at r-project.org
Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z
Hello,
Well, try it:
p <- .Machine$double.eps^seq(0.5, 1, by = 0.05)
z <- qnorm(p/2)
pnorm(z)
# [1] 7.450581e-09 1.228888e-09 2.026908e-10 3.343152e-11 5.514145e-12
# [6] 9.094947e-13 1.500107e-13 2.474254e-14 4.080996e-15 6.731134e-16
#[11] 1.110223e-16
p/2
# [1] 7.450581e-09 1.228888e-09 2.026908e-10 3.343152e-11 5.514145e-12
# [6] 9.094947e-13 1.500107e-13 2.474254e-14 4.080996e-15 6.731134e-16
#[11] 1.110223e-16
exp(z*z/2)
# [1] 9.184907e+06 5.301421e+07 3.073154e+08 1.787931e+09 1.043417e+10
# [6] 6.105491e+10 3.580873e+11 2.104460e+12 1.239008e+13 7.306423e+13
#[11] 4.314798e+14
p is the smallest possible such that 1 + p != 1 and I couldn't find
anything to worry about.
R version 3.6.0 (2019-04-26)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 19.04
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.8.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.8.0
locale:
[1] LC_CTYPE=pt_PT.UTF-8 LC_NUMERIC=C
[3] LC_TIME=pt_PT.UTF-8 LC_COLLATE=pt_PT.UTF-8
[5] LC_MONETARY=pt_PT.UTF-8 LC_MESSAGES=pt_PT.UTF-8
[7] LC_PAPER=pt_PT.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=pt_PT.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base
other attached packages:
[many packages loaded]
Hope this helps,
Rui Barradas
?s 15:24 de 21/06/19, jing hua zhao escreveu:
Dear R-developers,
I am keen to calculate exp(z*z/2) with z=qnorm(p/2) and p is very small. I wonder if anyone has experience with this?
Thanks very much in advance,
Jing Hua
[[alternative HTML version deleted]]
______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
[[alternative HTML version deleted]]
______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
[[alternative HTML version deleted]]
______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
include/Rmath.h declares a set of 'logspace' functions for use at the C level. I don't think there are core R functions that call them. /* Compute the log of a sum or difference from logs of terms, i.e., * * log (exp (logx) + exp (logy)) * or log (exp (logx) - exp (logy)) * * without causing overflows or throwing away too much accuracy: */ double Rf_logspace_add(double logx, double logy); double Rf_logspace_sub(double logx, double logy); double Rf_logspace_sum(const double *logx, int nx); Bill Dunlap TIBCO Software wdunlap tibco.com
On Sun, Jun 23, 2019 at 1:40 AM Ben Bolker <bbolker at gmail.com> wrote:
I agree with many the sentiments about the wisdom of computing very small p-values (although the example below may win some kind of a prize: I've seen people talking about p-values of the order of 10^(-2000), but never 10^(-(10^8)) !). That said, there are a several tricks for getting more reasonable sums of very small probabilities. The first is to scale the p-values by dividing the *largest* of the probabilities, then do the (p/sum(p)) computation, then multiply the result (I'm sure this is described/documented somewhere). More generally, there are methods for computing sums on the log scale, e.g. https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.misc.logsumexp.html I don't know where this has been implemented in the R ecosystem, but this sort of computation is the basis of the "Brobdingnag" package for operating on very large ("Brobdingnagian") and very small ("Lilliputian") numbers. On 2019-06-21 6:58 p.m., jing hua zhao wrote:
Hi Peter, Rui, Chrstophe and Gabriel, Thanks for your inputs -- the use of qnorm(., log=TRUE) is a good point
in line with pnorm with which we devised log(p) as
log(2) + pnorm(-abs(z), lower.tail = TRUE, log.p = TRUE) that could do really really well for large z compared to Rmpfr. Maybe I
am asking too much since
z <-20000
Rmpfr::format(2*pnorm(mpfr(-abs(z),100),lower.tail=TRUE,log.p=FALSE))
[1] "1.660579603192917090365313727164e-86858901" already gives a rarely seen small p value. I gather I also need a
multiple precision exp() and their sum since exp(z^2/2) is also a Bayes Factor so I get log(x_i )/sum_i log(x_i) instead. To this point, I am obliged to clarify, see https://statgen.github.io/gwas-credible-sets/method/locuszoom-credible-sets.pdf .
I agree many feel geneticists go to far with small p values which I
would have difficulty to argue againston the other hand it is also expected to see these in a non-genetic context. For instance the Framingham study was established in 1948 just got $34m for six years on phenotypewide association which we would be interesting to see.
Best wishes, Jing Hua
________________________________
From: peter dalgaard <pdalgd at gmail.com>
Sent: 21 June 2019 16:24
To: jing hua zhao
Cc: Rui Barradas; r-devel at r-project.org
Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z
You may want to look into using the log option to qnorm
e.g., in round figures:
log(1e-300)
[1] -690.7755
qnorm(-691, log=TRUE)
[1] -37.05315
exp(37^2/2)
[1] 1.881797e+297
exp(-37^2/2)
[1] 5.314068e-298
Notice that floating point representation cuts out at 1e+/-308 or so. If
you want to go outside that range, you may need explicit manipulation of the log values. qnorm() itself seems quite happy with much smaller values:
qnorm(-5000, log=TRUE)
[1] -99.94475 -pd
On 21 Jun 2019, at 17:11 , jing hua zhao <jinghuazhao at hotmail.com>
wrote:
Dear Rui, Thanks for your quick reply -- this allows me to see the bottom of
this. I was hoping we could have a handle of those p in genmoics such as 1e-300 or smaller.
Best wishes, Jing Hua
________________________________
From: Rui Barradas <ruipbarradas at sapo.pt>
Sent: 21 June 2019 15:03
To: jing hua zhao; r-devel at r-project.org
Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z
Hello,
Well, try it:
p <- .Machine$double.eps^seq(0.5, 1, by = 0.05)
z <- qnorm(p/2)
pnorm(z)
# [1] 7.450581e-09 1.228888e-09 2.026908e-10 3.343152e-11 5.514145e-12
# [6] 9.094947e-13 1.500107e-13 2.474254e-14 4.080996e-15 6.731134e-16
#[11] 1.110223e-16
p/2
# [1] 7.450581e-09 1.228888e-09 2.026908e-10 3.343152e-11 5.514145e-12
# [6] 9.094947e-13 1.500107e-13 2.474254e-14 4.080996e-15 6.731134e-16
#[11] 1.110223e-16
exp(z*z/2)
# [1] 9.184907e+06 5.301421e+07 3.073154e+08 1.787931e+09 1.043417e+10
# [6] 6.105491e+10 3.580873e+11 2.104460e+12 1.239008e+13 7.306423e+13
#[11] 4.314798e+14
p is the smallest possible such that 1 + p != 1 and I couldn't find
anything to worry about.
R version 3.6.0 (2019-04-26)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 19.04
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.8.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.8.0
locale:
[1] LC_CTYPE=pt_PT.UTF-8 LC_NUMERIC=C
[3] LC_TIME=pt_PT.UTF-8 LC_COLLATE=pt_PT.UTF-8
[5] LC_MONETARY=pt_PT.UTF-8 LC_MESSAGES=pt_PT.UTF-8
[7] LC_PAPER=pt_PT.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=pt_PT.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base
other attached packages:
[many packages loaded]
Hope this helps,
Rui Barradas
?s 15:24 de 21/06/19, jing hua zhao escreveu:
Dear R-developers,
I am keen to calculate exp(z*z/2) with z=qnorm(p/2) and p is very
small. I wonder if anyone has experience with this?
Thanks very much in advance,
Jing Hua
[[alternative HTML version deleted]]
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[[alternative HTML version deleted]]
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
[[alternative HTML version deleted]]
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
On 22/06/2019 00:58, jing hua zhao wrote:
Hi Peter, Rui, Chrstophe and Gabriel, Thanks for your inputs -- the use of qnorm(., log=TRUE) is a good point
Another approach could be simply to note that a function defined as f(p)=exp(-z(p)^2/2) is regular around p=0 with f(0)=0. It has roughly the shape of p*(2-p) for p \in [0; 1]. So we can calculate let say f(10^-10) with sufficient precision using Rmpfr and then use a linear approximation for p from [0, 10^-10]. After that a simple inverse gives us e^(z*z/2). Serguei.
in line with pnorm with which we devised log(p) as log(2) + pnorm(-abs(z), lower.tail = TRUE, log.p = TRUE) that could do really really well for large z compared to Rmpfr. Maybe I am asking too much since z <-20000
Rmpfr::format(2*pnorm(mpfr(-abs(z),100),lower.tail=TRUE,log.p=FALSE))
[1] "1.660579603192917090365313727164e-86858901" already gives a rarely seen small p value. I gather I also need a multiple precision exp() and their sum since exp(z^2/2) is also a Bayes Factor so I get log(x_i )/sum_i log(x_i) instead. To this point, I am obliged to clarify, see https://statgen.github.io/gwas-credible-sets/method/locuszoom-credible-sets.pdf. I agree many feel geneticists go to far with small p values which I would have difficulty to argue againston the other hand it is also expected to see these in a non-genetic context. For instance the Framingham study was established in 1948 just got $34m for six years on phenotypewide association which we would be interesting to see. Best wishes, Jing Hua
________________________________
From: peter dalgaard <pdalgd at gmail.com>
Sent: 21 June 2019 16:24
To: jing hua zhao
Cc: Rui Barradas; r-devel at r-project.org
Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z
You may want to look into using the log option to qnorm
e.g., in round figures:
log(1e-300)
[1] -690.7755
qnorm(-691, log=TRUE)
[1] -37.05315
exp(37^2/2)
[1] 1.881797e+297
exp(-37^2/2)
[1] 5.314068e-298
Notice that floating point representation cuts out at 1e+/-308 or so. If you want to go outside that range, you may need explicit manipulation of the log values. qnorm() itself seems quite happy with much smaller values:
qnorm(-5000, log=TRUE)
[1] -99.94475
-pd
On 21 Jun 2019, at 17:11 , jing hua zhao <jinghuazhao at hotmail.com> wrote:
Dear Rui,
Thanks for your quick reply -- this allows me to see the bottom of this. I was hoping we could have a handle of those p in genmoics such as 1e-300 or smaller.
Best wishes,
Jing Hua
________________________________
From: Rui Barradas <ruipbarradas at sapo.pt>
Sent: 21 June 2019 15:03
To: jing hua zhao; r-devel at r-project.org
Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z
Hello,
Well, try it:
p <- .Machine$double.eps^seq(0.5, 1, by = 0.05)
z <- qnorm(p/2)
pnorm(z)
# [1] 7.450581e-09 1.228888e-09 2.026908e-10 3.343152e-11 5.514145e-12
# [6] 9.094947e-13 1.500107e-13 2.474254e-14 4.080996e-15 6.731134e-16
#[11] 1.110223e-16
p/2
# [1] 7.450581e-09 1.228888e-09 2.026908e-10 3.343152e-11 5.514145e-12
# [6] 9.094947e-13 1.500107e-13 2.474254e-14 4.080996e-15 6.731134e-16
#[11] 1.110223e-16
exp(z*z/2)
# [1] 9.184907e+06 5.301421e+07 3.073154e+08 1.787931e+09 1.043417e+10
# [6] 6.105491e+10 3.580873e+11 2.104460e+12 1.239008e+13 7.306423e+13
#[11] 4.314798e+14
p is the smallest possible such that 1 + p != 1 and I couldn't find
anything to worry about.
R version 3.6.0 (2019-04-26)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 19.04
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.8.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.8.0
locale:
[1] LC_CTYPE=pt_PT.UTF-8 LC_NUMERIC=C
[3] LC_TIME=pt_PT.UTF-8 LC_COLLATE=pt_PT.UTF-8
[5] LC_MONETARY=pt_PT.UTF-8 LC_MESSAGES=pt_PT.UTF-8
[7] LC_PAPER=pt_PT.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=pt_PT.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base
other attached packages:
[many packages loaded]
Hope this helps,
Rui Barradas
?s 15:24 de 21/06/19, jing hua zhao escreveu:
Dear R-developers,
I am keen to calculate exp(z*z/2) with z=qnorm(p/2) and p is very small. I wonder if anyone has experience with this?
Thanks very much in advance,
Jing Hua
[[alternative HTML version deleted]]
______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
[[alternative HTML version deleted]]
______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
[[alternative HTML version deleted]]
______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
William Dunlap via R-devel
on Sun, 23 Jun 2019 10:34:47 -0700 writes:
William Dunlap via R-devel
on Sun, 23 Jun 2019 10:34:47 -0700 writes:
> include/Rmath.h declares a set of 'logspace' functions for use at the C
> level. I don't think there are core R functions that call them.
> /* Compute the log of a sum or difference from logs of terms, i.e.,
> *
> * log (exp (logx) + exp (logy))
> * or log (exp (logx) - exp (logy))
> *
> * without causing overflows or throwing away too much accuracy:
> */
> double Rf_logspace_add(double logx, double logy);
> double Rf_logspace_sub(double logx, double logy);
> double Rf_logspace_sum(const double *logx, int nx);
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
Yes, indeed, thank you, Bill!
But they *have* been in use by core R functions for a long time
in pgamma, pbeta and related functions.
[and I have had changes in *hyper.c where logspace_add() is used too,
for several years now (since 2015) but I no longer know which
concrete accuracy problem that addresses, so have not yet committed it]
Martin Maechler
ETH Zurich and R Core Team
> On Sun, Jun 23, 2019 at 1:40 AM Ben Bolker <bbolker at gmail.com> wrote:
>>
>> I agree with many the sentiments about the wisdom of computing very
>> small p-values (although the example below may win some kind of a prize:
>> I've seen people talking about p-values of the order of 10^(-2000), but
>> never 10^(-(10^8)) !). That said, there are a several tricks for
>> getting more reasonable sums of very small probabilities. The first is
>> to scale the p-values by dividing the *largest* of the probabilities,
>> then do the (p/sum(p)) computation, then multiply the result (I'm sure
>> this is described/documented somewhere). More generally, there are
>> methods for computing sums on the log scale, e.g.
>>
>>
>> https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.misc.logsumexp.html
>>
>> I don't know where this has been implemented in the R ecosystem, but
>> this sort of computation is the basis of the "Brobdingnag" package for
>> operating on very large ("Brobdingnagian") and very small
>> ("Lilliputian") numbers.
>>
>>
>> On 2019-06-21 6:58 p.m., jing hua zhao wrote:
>> > Hi Peter, Rui, Chrstophe and Gabriel,
>> >
>> > Thanks for your inputs -- the use of qnorm(., log=TRUE) is a good point
>> in line with pnorm with which we devised log(p) as
>> >
>> > log(2) + pnorm(-abs(z), lower.tail = TRUE, log.p = TRUE)
>> >
>> > that could do really really well for large z compared to Rmpfr. Maybe I
>> am asking too much since
>> >
>> > z <-20000
>> >> Rmpfr::format(2*pnorm(mpfr(-abs(z),100),lower.tail=TRUE,log.p=FALSE))
>> > [1] "1.660579603192917090365313727164e-86858901"
>> >
>> > already gives a rarely seen small p value. I gather I also need a
>> multiple precision exp() and their sum since exp(z^2/2) is also a Bayes
>> Factor so I get log(x_i )/sum_i log(x_i) instead. To this point, I am
>> obliged to clarify, see
>> https://statgen.github.io/gwas-credible-sets/method/locuszoom-credible-sets.pdf
>> .
>> >
>> > I agree many feel geneticists go to far with small p values which I
>> would have difficulty to argue againston the other hand it is also expected
>> to see these in a non-genetic context. For instance the Framingham study
>> was established in 1948 just got $34m for six years on phenotypewide
>> association which we would be interesting to see.
>> >
>> > Best wishes,
>> >
>> >
>> > Jing Hua
>> >
>> >
>> > ________________________________
>> > From: peter dalgaard <pdalgd at gmail.com>
>> > Sent: 21 June 2019 16:24
>> > To: jing hua zhao
>> > Cc: Rui Barradas; r-devel at r-project.org
>> > Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z
>> >
>> > You may want to look into using the log option to qnorm
>> >
>> > e.g., in round figures:
>> >
>> >> log(1e-300)
>> > [1] -690.7755
>> >> qnorm(-691, log=TRUE)
>> > [1] -37.05315
>> >> exp(37^2/2)
>> > [1] 1.881797e+297
>> >> exp(-37^2/2)
>> > [1] 5.314068e-298
>> >
>> > Notice that floating point representation cuts out at 1e+/-308 or so. If
>> you want to go outside that range, you may need explicit manipulation of
>> the log values. qnorm() itself seems quite happy with much smaller values:
>> >
>> >> qnorm(-5000, log=TRUE)
>> > [1] -99.94475
>> >
>> > -pd
>> >
>> >> On 21 Jun 2019, at 17:11 , jing hua zhao <jinghuazhao at hotmail.com>
>> wrote:
>> >>
>> >> Dear Rui,
>> >>
>> >> Thanks for your quick reply -- this allows me to see the bottom of
>> this. I was hoping we could have a handle of those p in genmoics such as
>> 1e-300 or smaller.
>> >>
>> >> Best wishes,
>> >>
>> >>
>> >> Jing Hua
>> >>
>> >> ________________________________
>> >> From: Rui Barradas <ruipbarradas at sapo.pt>
>> >> Sent: 21 June 2019 15:03
>> >> To: jing hua zhao; r-devel at r-project.org
>> >> Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z
>> >>
>> >> Hello,
>> >>
>> >> Well, try it:
>> >>
>> >> p <- .Machine$double.eps^seq(0.5, 1, by = 0.05)
>> >> z <- qnorm(p/2)
>> >>
>> >> pnorm(z)
>> >> # [1] 7.450581e-09 1.228888e-09 2.026908e-10 3.343152e-11 5.514145e-12
>> >> # [6] 9.094947e-13 1.500107e-13 2.474254e-14 4.080996e-15 6.731134e-16
>> >> #[11] 1.110223e-16
>> >> p/2
>> >> # [1] 7.450581e-09 1.228888e-09 2.026908e-10 3.343152e-11 5.514145e-12
>> >> # [6] 9.094947e-13 1.500107e-13 2.474254e-14 4.080996e-15 6.731134e-16
>> >> #[11] 1.110223e-16
>> >>
>> >> exp(z*z/2)
>> >> # [1] 9.184907e+06 5.301421e+07 3.073154e+08 1.787931e+09 1.043417e+10
>> >> # [6] 6.105491e+10 3.580873e+11 2.104460e+12 1.239008e+13 7.306423e+13
>> >> #[11] 4.314798e+14
>> >>
>> >>
>> >> p is the smallest possible such that 1 + p != 1 and I couldn't find
>> >> anything to worry about.
>> >>
>> >>
>> >> R version 3.6.0 (2019-04-26)
>> >> Platform: x86_64-pc-linux-gnu (64-bit)
>> >> Running under: Ubuntu 19.04
>> >>
>> >> Matrix products: default
>> >> BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.8.0
>> >> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.8.0
>> >>
>> >> locale:
>> >> [1] LC_CTYPE=pt_PT.UTF-8 LC_NUMERIC=C
>> >> [3] LC_TIME=pt_PT.UTF-8 LC_COLLATE=pt_PT.UTF-8
>> >> [5] LC_MONETARY=pt_PT.UTF-8 LC_MESSAGES=pt_PT.UTF-8
>> >> [7] LC_PAPER=pt_PT.UTF-8 LC_NAME=C
>> >> [9] LC_ADDRESS=C LC_TELEPHONE=C
>> >> [11] LC_MEASUREMENT=pt_PT.UTF-8 LC_IDENTIFICATION=C
>> >>
>> >> attached base packages:
>> >> [1] stats graphics grDevices utils datasets methods
>> >> [7] base
>> >>
>> >> other attached packages:
>> >>
>> >> [many packages loaded]
>> >>
>> >>
>> >> Hope this helps,
>> >>
>> >> Rui Barradas
>> >>
>> >> ?s 15:24 de 21/06/19, jing hua zhao escreveu:
>> >>> Dear R-developers,
>> >>>
>> >>> I am keen to calculate exp(z*z/2) with z=qnorm(p/2) and p is very
>> small. I wonder if anyone has experience with this?
>> >>>
>> >>> Thanks very much in advance,
>> >>>
>> >>>
>> >>> Jing Hua
>> >>>
>> >>> [[alternative HTML version deleted]]
>> >>>
>> >>> ______________________________________________
>> >>> R-devel at r-project.org mailing list
>> >>> https://stat.ethz.ch/mailman/listinfo/r-devel
>> >>>
>> >>
>> >> [[alternative HTML version deleted]]
>> >>
>> >> ______________________________________________
>> >> R-devel at r-project.org mailing list
>> >> https://stat.ethz.ch/mailman/listinfo/r-devel
>> >
>> > --
>> > Peter Dalgaard, Professor,
>> > Center for Statistics, Copenhagen Business School
>> > Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>> > Phone: (+45)38153501
>> > Office: A 4.23
>> > Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> > [[alternative HTML version deleted]]
>> >
>> > ______________________________________________
>> > R-devel at r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-devel
>> >
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
> [[alternative HTML version deleted]]
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
Hi All, Thanks for all your comments which allows me to appreciate more of these in Python and R. I just came across the matrixStats package, ## EXAMPLE #1 lx <- c(1000.01, 1000.02) y0 <- log(sum(exp(lx))) print(y0) ## Inf y1 <- logSumExp(lx) print(y1) ## 1000.708 and
ly <- lx*100000 ly
[1] 100001000 100002000
y1 <- logSumExp(ly) print(y1)
[1] 100002000
logSumExp
function (lx, idxs = NULL, na.rm = FALSE, ...)
{
has_na <- TRUE
.Call(C_logSumExp, as.numeric(lx), idxs, as.logical(na.rm),
has_na)
}
<bytecode: 0x20c07a8>
<environment: namespace:matrixStats>
Maybe this is rather close?
Best wishes,
Jing Hua
From: R-devel <r-devel-bounces at r-project.org> on behalf of Martin Maechler <maechler at stat.math.ethz.ch>
Sent: 24 June 2019 08:37
To: William Dunlap
Cc: r-devel at r-project.org
Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z
Sent: 24 June 2019 08:37
To: William Dunlap
Cc: r-devel at r-project.org
Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z
>>>>> William Dunlap via R-devel
>>>>> on Sun, 23 Jun 2019 10:34:47 -0700 writes:
>>>>> William Dunlap via R-devel
>>>>> on Sun, 23 Jun 2019 10:34:47 -0700 writes:
> include/Rmath.h declares a set of 'logspace' functions for use at the C
> level. I don't think there are core R functions that call them.
> /* Compute the log of a sum or difference from logs of terms, i.e.,
> *
> * log (exp (logx) + exp (logy))
> * or log (exp (logx) - exp (logy))
> *
> * without causing overflows or throwing away too much accuracy:
> */
> double Rf_logspace_add(double logx, double logy);
> double Rf_logspace_sub(double logx, double logy);
> double Rf_logspace_sum(const double *logx, int nx);
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
Yes, indeed, thank you, Bill!
But they *have* been in use by core R functions for a long time
in pgamma, pbeta and related functions.
[and I have had changes in *hyper.c where logspace_add() is used too,
for several years now (since 2015) but I no longer know which
concrete accuracy problem that addresses, so have not yet committed it]
Martin Maechler
ETH Zurich and R Core Team
> On Sun, Jun 23, 2019 at 1:40 AM Ben Bolker <bbolker at gmail.com> wrote:
>>
>> I agree with many the sentiments about the wisdom of computing very
>> small p-values (although the example below may win some kind of a prize:
>> I've seen people talking about p-values of the order of 10^(-2000), but
>> never 10^(-(10^8)) !). That said, there are a several tricks for
>> getting more reasonable sums of very small probabilities. The first is
>> to scale the p-values by dividing the *largest* of the probabilities,
>> then do the (p/sum(p)) computation, then multiply the result (I'm sure
>> this is described/documented somewhere). More generally, there are
>> methods for computing sums on the log scale, e.g.
>>
>>
>> https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.misc.logsumexp.html
>>
>> I don't know where this has been implemented in the R ecosystem, but
>> this sort of computation is the basis of the "Brobdingnag" package for
>> operating on very large ("Brobdingnagian") and very small
>> ("Lilliputian") numbers.
>>
>>
>> On 2019-06-21 6:58 p.m., jing hua zhao wrote:
>> > Hi Peter, Rui, Chrstophe and Gabriel,
>> >
>> > Thanks for your inputs -- the use of qnorm(., log=TRUE) is a good point
>> in line with pnorm with which we devised log(p) as
>> >
>> > log(2) + pnorm(-abs(z), lower.tail = TRUE, log.p = TRUE)
>> >
>> > that could do really really well for large z compared to Rmpfr. Maybe I
>> am asking too much since
>> >
>> > z <-20000
>> >> Rmpfr::format(2*pnorm(mpfr(-abs(z),100),lower.tail=TRUE,log.p=FALSE))
>> > [1] "1.660579603192917090365313727164e-86858901"
>> >
>> > already gives a rarely seen small p value. I gather I also need a
>> multiple precision exp() and their sum since exp(z^2/2) is also a Bayes
>> Factor so I get log(x_i )/sum_i log(x_i) instead. To this point, I am
>> obliged to clarify, see
>> https://statgen.github.io/gwas-credible-sets/method/locuszoom-credible-sets.pdf
>> .
>> >
>> > I agree many feel geneticists go to far with small p values which I
>> would have difficulty to argue againston the other hand it is also expected
>> to see these in a non-genetic context. For instance the Framingham study
>> was established in 1948 just got $34m for six years on phenotypewide
>> association which we would be interesting to see.
>> >
>> > Best wishes,
>> >
>> >
>> > Jing Hua
>> >
>> >
>> > ________________________________
>> > From: peter dalgaard <pdalgd at gmail.com>
>> > Sent: 21 June 2019 16:24
>> > To: jing hua zhao
>> > Cc: Rui Barradas; r-devel at r-project.org
>> > Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z
>> >
>> > You may want to look into using the log option to qnorm
>> >
>> > e.g., in round figures:
>> >
>> >> log(1e-300)
>> > [1] -690.7755
>> >> qnorm(-691, log=TRUE)
>> > [1] -37.05315
>> >> exp(37^2/2)
>> > [1] 1.881797e+297
>> >> exp(-37^2/2)
>> > [1] 5.314068e-298
>> >
>> > Notice that floating point representation cuts out at 1e+/-308 or so. If
>> you want to go outside that range, you may need explicit manipulation of
>> the log values. qnorm() itself seems quite happy with much smaller values:
>> >
>> >> qnorm(-5000, log=TRUE)
>> > [1] -99.94475
>> >
>> > -pd
>> >
>> >> On 21 Jun 2019, at 17:11 , jing hua zhao <jinghuazhao at hotmail.com>
>> wrote:
>> >>
>> >> Dear Rui,
>> >>
>> >> Thanks for your quick reply -- this allows me to see the bottom of
>> this. I was hoping we could have a handle of those p in genmoics such as
>> 1e-300 or smaller.
>> >>
>> >> Best wishes,
>> >>
>> >>
>> >> Jing Hua
>> >>
>> >> ________________________________
>> >> From: Rui Barradas <ruipbarradas at sapo.pt>
>> >> Sent: 21 June 2019 15:03
>> >> To: jing hua zhao; r-devel at r-project.org
>> >> Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z
>> >>
>> >> Hello,
>> >>
>> >> Well, try it:
>> >>
>> >> p <- .Machine$double.eps^seq(0.5, 1, by = 0.05)
>> >> z <- qnorm(p/2)
>> >>
>> >> pnorm(z)
>> >> # [1] 7.450581e-09 1.228888e-09 2.026908e-10 3.343152e-11 5.514145e-12
>> >> # [6] 9.094947e-13 1.500107e-13 2.474254e-14 4.080996e-15 6.731134e-16
>> >> #[11] 1.110223e-16
>> >> p/2
>> >> # [1] 7.450581e-09 1.228888e-09 2.026908e-10 3.343152e-11 5.514145e-12
>> >> # [6] 9.094947e-13 1.500107e-13 2.474254e-14 4.080996e-15 6.731134e-16
>> >> #[11] 1.110223e-16
>> >>
>> >> exp(z*z/2)
>> >> # [1] 9.184907e+06 5.301421e+07 3.073154e+08 1.787931e+09 1.043417e+10
>> >> # [6] 6.105491e+10 3.580873e+11 2.104460e+12 1.239008e+13 7.306423e+13
>> >> #[11] 4.314798e+14
>> >>
>> >>
>> >> p is the smallest possible such that 1 + p != 1 and I couldn't find
>> >> anything to worry about.
>> >>
>> >>
>> >> R version 3.6.0 (2019-04-26)
>> >> Platform: x86_64-pc-linux-gnu (64-bit)
>> >> Running under: Ubuntu 19.04
>> >>
>> >> Matrix products: default
>> >> BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.8.0
>> >> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.8.0
>> >>
>> >> locale:
>> >> [1] LC_CTYPE=pt_PT.UTF-8 LC_NUMERIC=C
>> >> [3] LC_TIME=pt_PT.UTF-8 LC_COLLATE=pt_PT.UTF-8
>> >> [5] LC_MONETARY=pt_PT.UTF-8 LC_MESSAGES=pt_PT.UTF-8
>> >> [7] LC_PAPER=pt_PT.UTF-8 LC_NAME=C
>> >> [9] LC_ADDRESS=C LC_TELEPHONE=C
>> >> [11] LC_MEASUREMENT=pt_PT.UTF-8 LC_IDENTIFICATION=C
>> >>
>> >> attached base packages:
>> >> [1] stats graphics grDevices utils datasets methods
>> >> [7] base
>> >>
>> >> other attached packages:
>> >>
>> >> [many packages loaded]
>> >>
>> >>
>> >> Hope this helps,
>> >>
>> >> Rui Barradas
>> >>
>> >> ?s 15:24 de 21/06/19, jing hua zhao escreveu:
>> >>> Dear R-developers,
>> >>>
>> >>> I am keen to calculate exp(z*z/2) with z=qnorm(p/2) and p is very
>> small. I wonder if anyone has experience with this?
>> >>>
>> >>> Thanks very much in advance,
>> >>>
>> >>>
>> >>> Jing Hua
>> >>>
>> >>> [[alternative HTML version deleted]]
>> >>>
>> >>> ______________________________________________
>> >>> R-devel at r-project.org mailing list
>> >>> https://stat.ethz.ch/mailman/listinfo/r-devel
>> >>>
>> >>
>> >> [[alternative HTML version deleted]]
>> >>
>> >> ______________________________________________
>> >> R-devel at r-project.org mailing list
>> >> https://stat.ethz.ch/mailman/listinfo/r-devel
>> >
>> > --
>> > Peter Dalgaard, Professor,
>> > Center for Statistics, Copenhagen Business School
>> > Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>> > Phone: (+45)38153501
>> > Office: A 4.23
>> > Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> > [[alternative HTML version deleted]]
>> >
>> > ______________________________________________
>> > R-devel at r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-devel
>> >
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
> [[alternative HTML version deleted]]
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
jing hua zhao
on Mon, 24 Jun 2019 08:51:43 +0000 writes:
> Hi All,
> Thanks for all your comments which allows me to appreciate more of these in Python and R.
> I just came across the matrixStats package,
> ## EXAMPLE #1
> lx <- c(1000.01, 1000.02)
> y0 <- log(sum(exp(lx)))
> print(y0) ## Inf
> y1 <- logSumExp(lx)
> print(y1) ## 1000.708
> and
>> ly <- lx*100000
>> ly
> [1] 100001000 100002000
>> y1 <- logSumExp(ly)
>> print(y1)
> [1] 100002000
>> logSumExp
> function (lx, idxs = NULL, na.rm = FALSE, ...)
> {
> has_na <- TRUE
> .Call(C_logSumExp, as.numeric(lx), idxs, as.logical(na.rm),
> has_na)
> }
> <bytecode: 0x20c07a8>
> <environment: namespace:matrixStats>
> Maybe this is rather close?
Thank you Jing Hua,
indeed the issue of sums of (very large or very small)
exponentials is a special case, that can well be treated
specially
- as it is not so infrequent
- via "obvious" simplifications can be implemented even more accurately
We (authors of the R package 'copula') have had a need for
these as well, in likelihood computation for Archimedean
copulas, and
have efficient R level implementations, both for your case and
the -- even more delicate -- case of e.g., alternating signs of
exponential terms.
"Unfortunately", we had never exported the functions from the
package, so you'd need
copula:::lsum() # log sum {of exponentials in log scale}
or
copula:::lssum() # log *s*igned sum {of exponentials in log scale}
for the 2nd case.
The advantage is it's simple R code implemented quite
efficiently for the vector and matrices cases,
Source code from source file copula/R/special-func.R
(in svn at R-forge :
-->
https://r-forge.r-project.org/scm/viewvc.php/pkg/copula/R/special-func.R?view=markup&root=copula )
{Yes, this GPL-2 licenced {with Copyright, see file, please
keep this one line}
## Copyright (C) 2012 Marius Hofert, Ivan Kojadinovic, Martin Maechler, and Jun Y
}
----------------------------------------------------------------------
##' Properly compute log(x_1 + .. + x_n) for a given (n x d)-matrix of n row
##' vectors log(x_1),..,log(x_n) (each of dimension d)
##' Here, x_i > 0 for all i
##' @title Properly compute the logarithm of a sum
##' @param lx (n,d)-matrix containing the row vectors log(x_1),..,log(x_n)
##' each of dimension d
##' @param l.off the offset to subtract and re-add; ideally in the order of
##' the maximum of each column
##' @return log(x_1 + .. + x_n) [i.e., OF DIMENSION d!!!] computed via
##' log(sum(x)) = log(sum(exp(log(x))))
##' = log(exp(log(x_max))*sum(exp(log(x)-log(x_max))))
##' = log(x_max) + log(sum(exp(log(x)-log(x_max)))))
##' = lx.max + log(sum(exp(lx-lx.max)))
##' => VECTOR OF DIMENSION d
##' @author Marius Hofert, Martin Maechler
lsum <- function(lx, l.off) {
rx <- length(d <- dim(lx))
if(mis.off <- missing(l.off)) l.off <- {
if(rx <= 1L)
max(lx)
else if(rx == 2L)
apply(lx, 2L, max)
}
if(rx <= 1L) { ## vector
if(is.finite(l.off))
l.off + log(sum(exp(lx - l.off)))
else if(mis.off || is.na(l.off) || l.off == max(lx))
l.off # NA || NaN or all lx == -Inf, or max(.) == Inf
else
stop("'l.off is infinite but not == max(.)")
} else if(rx == 2L) { ## matrix
if(any(x.off <- !is.finite(l.off))) {
if(mis.off || isTRUE(all.equal(l.off, apply(lx, 2L, max)))) {
## we know l.off = colMax(.)
if(all(x.off)) return(l.off)
r <- l.off
iok <- which(!x.off)
l.of <- l.off[iok]
r[iok] <- l.of + log(colSums(exp(lx[,iok,drop=FALSE] -
rep(l.of, each=d[1]))))
r
} else ## explicitly specified l.off differing from colMax(.)
stop("'l.off' has non-finite values but differs from default max(.)")
}
else
l.off + log(colSums(exp(lx - rep(l.off, each=d[1]))))
} else stop("not yet implemented for arrays of rank >= 3")
}
##' Properly compute log(x_1 + .. + x_n) for a given matrix of column vectors
##' log(|x_1|),.., log(|x_n|) and corresponding signs sign(x_1),.., sign(x_n)
##' Here, x_i is of arbitrary sign
##' @title compute logarithm of a sum with signed large coefficients
##' @param lxabs (d,n)-matrix containing the column vectors log(|x_1|),..,log(|x_n|)
##' each of dimension d
##' @param signs corresponding matrix of signs sign(x_1), .., sign(x_n)
##' @param l.off the offset to subtract and re-add; ideally in the order of max(.)
##' @param strict logical indicating if it should stop on some negative sums
##' @return log(x_1 + .. + x_n) [i.e., of dimension d] computed via
##' log(sum(x)) = log(sum(sign(x)*|x|)) = log(sum(sign(x)*exp(log(|x|))))
##' = log(exp(log(x0))*sum(signs*exp(log(|x|)-log(x0))))
##' = log(x0) + log(sum(signs* exp(log(|x|)-log(x0))))
##' = l.off + log(sum(signs* exp(lxabs - l.off )))
##' @author Marius Hofert and Martin Maechler
lssum <- function (lxabs, signs, l.off = apply(lxabs, 2, max), strict = TRUE) {
stopifnot(length(dim(lxabs)) == 2L) # is.matrix(.) generalized
sum. <- colSums(signs * exp(lxabs - rep(l.off, each=nrow(lxabs))))
if(anyNA(sum.) || any(sum. <= 0))
(if(strict) stop else warning)("lssum found non-positive sums")
l.off + log(sum.)
}
----------------------------------------------------------------------
> Best wishes,
> Jing Hua
> ________________________________
> From: R-devel <r-devel-bounces at r-project.org> on behalf of Martin Maechler <maechler at stat.math.ethz.ch>
> Sent: 24 June 2019 08:37
> To: William Dunlap
> Cc: r-devel at r-project.org
> Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z
William Dunlap via R-devel
on Sun, 23 Jun 2019 10:34:47 -0700 writes:
William Dunlap via R-devel
on Sun, 23 Jun 2019 10:34:47 -0700 writes:
>> include/Rmath.h declares a set of 'logspace' functions for use at the C
>> level. I don't think there are core R functions that call them.
>> /* Compute the log of a sum or difference from logs of terms, i.e.,
>> *
>> * log (exp (logx) + exp (logy))
>> * or log (exp (logx) - exp (logy))
>> *
>> * without causing overflows or throwing away too much accuracy:
>> */
>> double Rf_logspace_add(double logx, double logy);
>> double Rf_logspace_sub(double logx, double logy);
>> double Rf_logspace_sum(const double *logx, int nx);
>> Bill Dunlap
>> TIBCO Software
>> wdunlap tibco.com
> Yes, indeed, thank you, Bill!
> But they *have* been in use by core R functions for a long time
> in pgamma, pbeta and related functions.
> [and I have had changes in *hyper.c where logspace_add() is used too,
> for several years now (since 2015) but I no longer know which
> concrete accuracy problem that addresses, so have not yet committed it]
> Martin Maechler
> ETH Zurich and R Core Team
Hi Martin, Thanks for another tips -- it will be hard for me to comprehend fully your implementation of Rmpfr. I will look at these instead. I realised what I thought was a difficult problem turned to be a popular one. I managed to get a Python version here for information. # http://bayesjumping.net/log-sum-exp-trick/ import numpy as np def logSumExp(ns): max = np.max(ns) ds = ns - max sumOfExp = np.exp(ds).sum() return max + np.log(sumOfExp) x = [100001000, 100002000] logSumExp(x) from scipy.misc import logsumexp logsumexp(x) import numpy as np; prob1 = np.log(1e-50) prob2 = np.log(2.5e-50) prob12 = np.logaddexp(prob1, prob2) prob12 np.exp(prob12) As expected they are in good agreement with R. Best regards, Jing Hua
From: Martin Maechler <maechler at stat.math.ethz.ch>
Sent: 24 June 2019 09:29
To: jing hua zhao
Cc: William Dunlap; Martin Maechler; r-devel at r-project.org
Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z
Sent: 24 June 2019 09:29
To: jing hua zhao
Cc: William Dunlap; Martin Maechler; r-devel at r-project.org
Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z
>>>>> jing hua zhao
>>>>> on Mon, 24 Jun 2019 08:51:43 +0000 writes:
> Hi All,
> Thanks for all your comments which allows me to appreciate more of these in Python and R.
> I just came across the matrixStats package,
> ## EXAMPLE #1
> lx <- c(1000.01, 1000.02)
> y0 <- log(sum(exp(lx)))
> print(y0) ## Inf
> y1 <- logSumExp(lx)
> print(y1) ## 1000.708
> and
>> ly <- lx*100000
>> ly
> [1] 100001000 100002000
>> y1 <- logSumExp(ly)
>> print(y1)
> [1] 100002000
>> logSumExp
> function (lx, idxs = NULL, na.rm = FALSE, ...)
> {
> has_na <- TRUE
> .Call(C_logSumExp, as.numeric(lx), idxs, as.logical(na.rm),
> has_na)
> }
> <bytecode: 0x20c07a8>
> <environment: namespace:matrixStats>
> Maybe this is rather close?
Thank you Jing Hua,
indeed the issue of sums of (very large or very small)
exponentials is a special case, that can well be treated
specially
- as it is not so infrequent
- via "obvious" simplifications can be implemented even more accurately
We (authors of the R package 'copula') have had a need for
these as well, in likelihood computation for Archimedean
copulas, and
have efficient R level implementations, both for your case and
the -- even more delicate -- case of e.g., alternating signs of
exponential terms.
"Unfortunately", we had never exported the functions from the
package, so you'd need
copula:::lsum() # log sum {of exponentials in log scale}
or
copula:::lssum() # log *s*igned sum {of exponentials in log scale}
for the 2nd case.
The advantage is it's simple R code implemented quite
efficiently for the vector and matrices cases,
Source code from source file copula/R/special-func.R
(in svn at R-forge :
-->
https://r-forge.r-project.org/scm/viewvc.php/pkg/copula/R/special-func.R?view=markup&root=copula )
{Yes, this GPL-2 licenced {with Copyright, see file, please
keep this one line}
## Copyright (C) 2012 Marius Hofert, Ivan Kojadinovic, Martin Maechler, and Jun Y
}
----------------------------------------------------------------------
##' Properly compute log(x_1 + .. + x_n) for a given (n x d)-matrix of n row
##' vectors log(x_1),..,log(x_n) (each of dimension d)
##' Here, x_i > 0 for all i
##' @title Properly compute the logarithm of a sum
##' @param lx (n,d)-matrix containing the row vectors log(x_1),..,log(x_n)
##' each of dimension d
##' @param l.off the offset to subtract and re-add; ideally in the order of
##' the maximum of each column
##' @return log(x_1 + .. + x_n) [i.e., OF DIMENSION d!!!] computed via
##' log(sum(x)) = log(sum(exp(log(x))))
##' = log(exp(log(x_max))*sum(exp(log(x)-log(x_max))))
##' = log(x_max) + log(sum(exp(log(x)-log(x_max)))))
##' = lx.max + log(sum(exp(lx-lx.max)))
##' => VECTOR OF DIMENSION d
##' @author Marius Hofert, Martin Maechler
lsum <- function(lx, l.off) {
rx <- length(d <- dim(lx))
if(mis.off <- missing(l.off)) l.off <- {
if(rx <= 1L)
max(lx)
else if(rx == 2L)
apply(lx, 2L, max)
}
if(rx <= 1L) { ## vector
if(is.finite(l.off))
l.off + log(sum(exp(lx - l.off)))
else if(mis.off || is.na(l.off) || l.off == max(lx))
l.off # NA || NaN or all lx == -Inf, or max(.) == Inf
else
stop("'l.off is infinite but not == max(.)")
} else if(rx == 2L) { ## matrix
if(any(x.off <- !is.finite(l.off))) {
if(mis.off || isTRUE(all.equal(l.off, apply(lx, 2L, max)))) {
## we know l.off = colMax(.)
if(all(x.off)) return(l.off)
r <- l.off
iok <- which(!x.off)
l.of <- l.off[iok]
r[iok] <- l.of + log(colSums(exp(lx[,iok,drop=FALSE] -
rep(l.of, each=d[1]))))
r
} else ## explicitly specified l.off differing from colMax(.)
stop("'l.off' has non-finite values but differs from default max(.)")
}
else
l.off + log(colSums(exp(lx - rep(l.off, each=d[1]))))
} else stop("not yet implemented for arrays of rank >= 3")
}
##' Properly compute log(x_1 + .. + x_n) for a given matrix of column vectors
##' log(|x_1|),.., log(|x_n|) and corresponding signs sign(x_1),.., sign(x_n)
##' Here, x_i is of arbitrary sign
##' @title compute logarithm of a sum with signed large coefficients
##' @param lxabs (d,n)-matrix containing the column vectors log(|x_1|),..,log(|x_n|)
##' each of dimension d
##' @param signs corresponding matrix of signs sign(x_1), .., sign(x_n)
##' @param l.off the offset to subtract and re-add; ideally in the order of max(.)
##' @param strict logical indicating if it should stop on some negative sums
##' @return log(x_1 + .. + x_n) [i.e., of dimension d] computed via
##' log(sum(x)) = log(sum(sign(x)*|x|)) = log(sum(sign(x)*exp(log(|x|))))
##' = log(exp(log(x0))*sum(signs*exp(log(|x|)-log(x0))))
##' = log(x0) + log(sum(signs* exp(log(|x|)-log(x0))))
##' = l.off + log(sum(signs* exp(lxabs - l.off )))
##' @author Marius Hofert and Martin Maechler
lssum <- function (lxabs, signs, l.off = apply(lxabs, 2, max), strict = TRUE) {
stopifnot(length(dim(lxabs)) == 2L) # is.matrix(.) generalized
sum. <- colSums(signs * exp(lxabs - rep(l.off, each=nrow(lxabs))))
if(anyNA(sum.) || any(sum. <= 0))
(if(strict) stop else warning)("lssum found non-positive sums")
l.off + log(sum.)
}
----------------------------------------------------------------------
> Best wishes,
> Jing Hua
> ________________________________
> From: R-devel <r-devel-bounces at r-project.org> on behalf of Martin Maechler <maechler at stat.math.ethz.ch>
> Sent: 24 June 2019 08:37
> To: William Dunlap
> Cc: r-devel at r-project.org
> Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z
>>>>> William Dunlap via R-devel
>>>>> on Sun, 23 Jun 2019 10:34:47 -0700 writes:
>>>>> William Dunlap via R-devel
>>>>> on Sun, 23 Jun 2019 10:34:47 -0700 writes:
>> include/Rmath.h declares a set of 'logspace' functions for use at the C
>> level. I don't think there are core R functions that call them.
>> /* Compute the log of a sum or difference from logs of terms, i.e.,
>> *
>> * log (exp (logx) + exp (logy))
>> * or log (exp (logx) - exp (logy))
>> *
>> * without causing overflows or throwing away too much accuracy:
>> */
>> double Rf_logspace_add(double logx, double logy);
>> double Rf_logspace_sub(double logx, double logy);
>> double Rf_logspace_sum(const double *logx, int nx);
>> Bill Dunlap
>> TIBCO Software
>> wdunlap tibco.com
[[elided Hotmail spam]]
> But they *have* been in use by core R functions for a long time
> in pgamma, pbeta and related functions.
> [and I have had changes in *hyper.c where logspace_add() is used too,
> for several years now (since 2015) but I no longer know which
> concrete accuracy problem that addresses, so have not yet committed it]
> Martin Maechler
> ETH Zurich and R Core Team