Skip to content

Exactness of ppois

2 messages · Matthias Kohl, Martin Maechler

#
Hello,

by checking the precision of a convolution algorithm, we found the 
following "inexactness":
We work with R Version 1.8.1 (2003-11-21) on Windows systems (NT, 2000, 
XP).

Try the code:
## Kolmogorov distance between two methods to
## determine P(Poisson(lambda)<=x)
Kolm.dist <- function(lam, eps){
  x <- seq(0,qpois(1-eps, lambda=lam), by=1)
  max(abs(ppois(x, lambda=lam)-cumsum(dpois(x, lambda=lam))))
}
erg<-optimize(Kolm.dist, lower=900, upper=1000, maximum=TRUE, eps=1e-15)
erg

Kolm1.dist <- function(lam, eps){
  x <- seq(0,qpois(1-eps, lambda=lam), by=1)
  which.max(abs(ppois(x, lambda=lam)-cumsum(dpois(x, lambda=lam))))
}
Kolm1.dist(lam=erg$max, eps=1e-15)

So for lambda=977.8 and x=1001 we get a distance of about 5.2e-06.
(This inexactness seems to hold for all lambda values greater than about 
900.)

BUT, summing about 1000 terms of exactness around 1e-16,
we would expect an error of order 1e-13.

We suspect algorithm AS 239 to cause that flaw.
Do you think this could cause other problems apart from
that admittedly extreme example?

Thanks for your attention!
Matthias
#
Matthias> Hello, by checking the precision of a convolution
    Matthias> algorithm, we found the following "inexactness":
    Matthias> We work with R Version 1.8.1 (2003-11-21) on
    Matthias> Windows systems (NT, 2000, XP).

    Matthias> Try the code:

    Matthias> So for lambda=977.8 and x=1001 we get a distance
    Matthias> of about 5.2e-06.  (This inexactness seems to hold
    Matthias> for all lambda values greater than about 900.)

    Matthias> BUT, summing about 1000 terms of exactness around 1e-16,
    Matthias> we would expect an error of order 1e-13.

    Matthias> We suspect algorithm AS 239 to cause that flaw.

correct.   Namely, because

 ppois(x, lambda, lower_tail, log_p) :=  
    pgamma(lambda, x + 1, 1., !lower_tail, log_p)

and pgamma(x, alph, scale) uses AS 239, currently. 
So this thread is really about the precision of R's current pgamma().

In your example, (x = 977.8, alph = 1002, scale=1) and 
in pgamma.c,
    alphlimit = 1000;
and later

    /* use a normal approximation if alph > alphlimit */
    if (alph > alphlimit) {
	pn1 = sqrt(alph) * 3. * (pow(x/alph, 1./3.) + 1. / (9. * alph) - 1.);
	return pnorm(pn1, 0., 1., lower_tail, log_p);
    }

So, we could conceivably 
improve the situation by increasing `alphlimit'.
Though, I don't see a real need for this (and it will cost CPU
cycles in these cases).

    Matthias> Do you think this could cause other problems apart
    Matthias> from that admittedly extreme example?

no, I don't think.  Look at

  > lam <- 977.8
  > (p1 <- ppois(1001, lam))
  [1] 0.77643705
  > (p2 <- sum(dpois(0:1001, lam)))
  [1] 0.77643187

Can you imagine a situation where this difference matters?

    Matthias> Thanks for your attention!
You're welcome.

Martin Maechler <maechler at stat.math.ethz.ch>	http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum  LEO C16	Leonhardstr. 27
ETH (Federal Inst. Technology)	8092 Zurich	SWITZERLAND
phone: x-41-1-632-3408		fax: ...-1228			<><