Dear Thomas,

Thank you, yes, that sounds good, and I take the point about integer overflow.

Various questions:
(a) Is there some way I can try out the routine with this modification? (I am on a Linux system where I am just a user - I cannot install new versions of software myself.)
(b) Is there a reference you can give me to a published paper where the method being used to compute the p-values is described?

Many thanks,
David.
------------------------------------------------------------------------------
On Fri, 18 Dec 2009, tlumley at u.washington.edu wrote:
I've fixed this by adding 0.5/mn to q. The problem (at least in principle)
with multiplying them all up is integer overflow.
By the time 0.5/mn underflows to zero, missing one value in the distribution
won't matter.
-thomas
On Fri, 18 Dec 2009, David John Allwright wrote:
Dear Thomas,

Right, thank you. Yes, I haven't looked at the source code (because I don't know C), but something like what you mention could well cause the kind of problems I am seeing: a loop being executed one too few or one too many times. And yes, I think those quantities should be multiplied up by m*n to all become integers so we escape rounding error problems.

David.
------------------------------------------------------------------------------
On Wed, 16 Dec 2009, tlumley at u.washington.edu wrote:
On Tue, 15 Dec 2009, allwrigh at maths.ox.ac.uk wrote (in part):
x <- 1:5
y <- c(2.5, 4.5)
ks.test(x, y)

The value of the D_2,5 statistic is calculated as 0.4 correctly, but the p-value is stated by R as 1, though in fact it should be 20/21 = 0.9524.
What we seem to have here is a rounding error problem.
In ks.c:psmirnov2x, there is a double loop including
    if(fabs(i / md - j / nd) > q)
        u[j] = 0;
where md=2, nd=5, and q=3/10.
Now, to full precision abs(1/2 - 4/5) > 3/10 is false, but at least on
my MacBook it is true in C double precision.
I'm not sure why the loop is working with doubles, since multiplying by
m*n should make everything an integer.
-thomas
Thomas Lumley Assoc. Professor, Biostatistics
tlumley at u.washington.edu University of Washington, Seattle