[Fwd: Re: pt inaccurate when x is close to 0 (PR#9945)]

Thu, Oct 11, 2007 8:10 AM #

Here's a contribution from Ian Smith that got bounced from the list.

-------- Original Message --------
Subject: Re: [Rd] pt inaccurate when x is close to 0 (PR#9945)
Date: Thu, 11 Oct 2007 06:02:43 -0400
From: iandjmsmith at aol.com
To: murdoch at stats.uwo.ca

Duncan,

I tried sending the rest of this to R-devel but it was rejected as spam, 
hence the personal e-mail.

R calculates the pt value from

nx = 1 + (x/n)*x;
val = pbeta(1./nx, n / 2., 0.5, /*lower_tail*/1, log_p);

whereas Gnumeric calculates the value as

val =? (n > x * x)
? pbeta (x * x / (n + x * x), 0.5, n / 2, /*lower_tail*/0, log_p)
: pbeta (n / (n + x * x), n / 2.0, 0.5, /*lower_tail*/1, log_p);

thus avoiding the loss of accuracy in the pbeta routine when 1-1./nx
is calculated.

It also makes the

if (n > 4e5) { /*-- Fixme(?): test should depend on `n' AND `x' ! */
??? /* Approx. from? Abramowitz & Stegun 26.7.8 (p.949) */
??? val = 1./(4.*n);
??? return pnorm(x*(1. - val)/sqrt(1. + x*x*2.*val), 0.0, 1.0,
???????? lower_tail, log_p);
}

code unneccessary.

Ian Smith

Personally, I think the code should also guard against the possible
overflow of the x * x expressions.

________________________________________________________________________
Get a FREE AOL Email account with unlimited storage.  Plus, share and 
store photos and experience exclusively recorded live music Sessions 
from your favourite artists. Find out more at 
http://info.aol.co.uk/joinnow/?ncid=548.

Martin Maechler

Fri, Oct 12, 2007 6:09 AM #

DM> Here's a contribution from Ian Smith that got bounced
    DM> from the list. 

 [well, given the obvious Spam that AOL appended at the end... ]

    DM> -------- Original Message --------
    DM> Subject: Re: [Rd] pt inaccurate when x is close to 0 (PR#9945)
    DM> Date: Thu, 11 Oct 2007 06:02:43 -0400
    DM> From: iandjmsmith at aol.com
    DM> To: murdoch at stats.uwo.ca


    DM> Duncan,

    DM> I tried sending the rest of this to R-devel but it was rejected as spam, 
    DM> hence the personal e-mail.


    DM> R calculates the pt value from


    DM> nx = 1 + (x/n)*x;
    DM> val = pbeta(1./nx, n / 2., 0.5, /*lower_tail*/1, log_p);


    DM> whereas Gnumeric calculates the value as


    DM> val =? (n > x * x)
    DM> ? pbeta (x * x / (n + x * x), 0.5, n / 2, /*lower_tail*/0, log_p)
    DM> : pbeta (n / (n + x * x), n / 2.0, 0.5, /*lower_tail*/1, log_p);

seems a good idea
        {{however I doubt the  "?"  in  "val =?" above }}

    DM> thus avoiding the loss of accuracy in the pbeta routine when 1-1./nx
    DM> is calculated.

    DM> It also makes the

    DM> if (n > 4e5) { /*-- Fixme(?): test should depend on `n' AND `x' ! */
    DM> ??? /* Approx. from? Abramowitz & Stegun 26.7.8 (p.949) */
    DM> ??? val = 1./(4.*n);
    DM> ??? return pnorm(x*(1. - val)/sqrt(1. + x*x*2.*val), 0.0, 1.0,
    DM> ???????? lower_tail, log_p);
    DM> }

    DM> code unneccessary.

probably, will have to see.

    DM> Ian Smith

    DM> Personally, I think the code should also guard against the possible
    DM> overflow of the x * x expressions.

The current code actually *does* guard 
since the overflow happens to "+Inf" and that does fulfill '> 1e100'
and the current code has

    nx = 1 + (x/n)*x;
    ....
    if(fabs(nx) > 1e100) {
    		....
    }

... 

I'll try to use the Gnumeric switch and see and think some more
about the other extreme cases.

Martin