tlumley at u.washington.edu
2009-Dec-18  16:30 UTC
[Rd] apparently incorrect p-values from 2-sided Kolmogorov-Smirnov (PR#14158)
I've fixed this by adding 0.5/mn to q.  The problem (at least in principle)
with multiplying them all up is integer overflow.
By the time 0.5/mn underflows to zero, missing one value in the distribution
won't matter.
      -thomas
On Fri, 18 Dec 2009, David John Allwright wrote:
> Dear Thomas, Right, thank you. Yes, I haven't looked at the source code
> (because I don't know C) but something like what you mention could well
cause
> the kind of problems I am seeing: a loop being exectued one too few or one 
> too many times. And yes, I think those quantities should be multiplied up
by
> m*n to all become integers so we escape rounding error problems. David.
>
------------------------------------------------------------------------------
> On Wed, 16 Dec 2009, tlumley at u.washington.edu wrote:
>
>> On Tue, 15 Dec 2009, allwrigh at maths.ox.ac.uk wrote; (in part)
>> 
>>> 
>>> x<-1:5
>>> y<-c(2.5,4.5)
>>> ks.test(x,y)
>>> 
>>> The value of the D_2,5 statistic is calculated as 0.4 correctly,
but the
>>> p-value is stated by R as 1, though in fact it should be
20/21=0.9524
>> 
>> 
>> What we seem to have here is a rounding error problem.
>> 
>> In ks.c:psmirnov2x,  there is a double loop including
>> 	    if(fabs(i / md - j / nd) > q)
>> 		u[j] = 0;
>> 
>> where md=2, nd=5, and q=3/10.
>> 
>> Now,  to full precision  abs(1/2 - 4/5) > 3/10 is false, but at
least on my
>> MacBook it is true in C double precision.
>> 
>> I'm not sure why the loop is working with doubles, since
multiplying by m*n
>> should make everything an integer.
>> 
>>     -thomas
>> 
>> Thomas Lumley			Assoc. Professor, Biostatistics
>> tlumley at u.washington.edu	University of Washington, Seattle
>> 
>> 
>> 
>
Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle
Possibly Parallel Threads
- apparently incorrect p-values from 2-sided Kolmogorov-Smirnov (PR#14178)
- apparently incorrect p-values from 2-sided Kolmogorov-Smirnov test (PR#14145)
- apparently incorrect p-values from 2-sided Kolmogorov-Smirnov (PR#14157)
- ks.test - The two-sample two-sided Kolmogorov-Smirnov test with ties (PR#13848)
- Kolmogorov-Smirnov: calculate p value given as input the test statistic
