Hi everybody,

while performing ks.test for an exponential distribution (rate 0.4) on samples of size 2500, generated fresh every time, I got this strange behaviour:

> data <- rexp(2500, 0.4)
> ks.test(data, "pexp", 0.4)

        One-sample Kolmogorov-Smirnov test

data:  data
D = 0.0147, p-value = 0.6549
alternative hypothesis: two.sided

> data <- rexp(2500, 0.4)
> ks.test(data, "pexp", 0.4)

        One-sample Kolmogorov-Smirnov test

data:  data
D = 0.019, p-value = 0.3305
alternative hypothesis: two.sided

> data <- rexp(2500, 0.4)
> ks.test(data, "pexp", 0.4)

        One-sample Kolmogorov-Smirnov test

data:  data
D = 0.0171, p-value = 0.4580
alternative hypothesis: two.sided

> data <- rexp(2500, 0.4)
> ks.test(data, "pexp", 0.4)

        One-sample Kolmogorov-Smirnov test

data:  data
D = 0.0143, p-value = 0.6841
alternative hypothesis: two.sided

> data <- rexp(2500, 0.4)
> ks.test(data, "pexp", 0.4)

        One-sample Kolmogorov-Smirnov test

data:  data
D = 0.0145, p-value = 0.6684
alternative hypothesis: two.sided

> data <- rexp(2500, 0.4)
> ks.test(data, "pexp", 0.4)

        One-sample Kolmogorov-Smirnov test

data:  data
D = 0.0123, p-value = 0.8435
alternative hypothesis: two.sided

> data <- rexp(2500, 0.4)
> ks.test(data, "pexp", 0.4)

        One-sample Kolmogorov-Smirnov test

data:  data
D = 0.0186, p-value = 0.3532
alternative hypothesis: two.sided

It seems kind of strange to me that the maximum p-value obtained is 0.8435, and the best I get from the rest is 0.66-0.68.
I'm not very experienced in running this kind of test, so am I doing something wrong?
I would expect p values ranging from 0.75 (to be kind) to 0.9, 0.95. How is this possible?

Thank you in advance for your answers.
See you soon
EM

Seven repetitions are not nearly enough to get a good estimate of the variability of the test statistic. Try this:

  nrep <- 500
  pvals <- tstvals <- numeric(nrep)
  for (i in seq(nrep)) {
    # fresh sample and test on every pass
    tmp <- ks.test(rexp(2500, 0.4), "pexp", 0.4)
    pvals[i] <- tmp$p.value
    tstvals[i] <- tmp$statistic
  }
  hist(pvals)
  hist(tstvals)
  # 5%, 10%, ..., 95% quantiles of the simulated p-values
  round(quantile(pvals, probs = seq(0.05, 0.95, 0.05)), 2)

At 2:36 PM +0000 2/3/06, Emanuele Mazzola wrote:
>[...]
>I would expect p values ranging from 0.75 (to be kind) to 0.9, 0.95. How is
>this possible?
>
>Thank you in advance for your answers.
>See you soon
>EM
>
>______________________________________________
>R-help at stat.math.ethz.ch mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

--
--------------------------------------
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA

The distribution of p-values should be uniform under the null hypothesis. When I do:

> jj <- numeric(10000)
> for(i in 1:10000) jj[i] <- ks.test(rexp(2500, .4), 'pexp', .4)$p.value
Warning messages:
1: cannot compute correct p-values with ties in: ks.test(rexp(2500, 0.4), "pexp", 0.4)
2: cannot compute correct p-values with ties in: ks.test(rexp(2500, 0.4), "pexp", 0.4)
3: cannot compute correct p-values with ties in: ks.test(rexp(2500, 0.4), "pexp", 0.4)
4: cannot compute correct p-values with ties in: ks.test(rexp(2500, 0.4), "pexp", 0.4)
> hist(jj, 50, col='yellow'); abline(h=200, col='green')

I get a histogram that looks reasonably flat to me (the green line at 200 is the count expected in each of the 50 bins when 10000 p-values are uniform).

Patrick Burns
patrick at burns-stat.com
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and "A Guide for the Unwilling S User")

Emanuele Mazzola wrote:
>[...]
>I would expect p values ranging from 0.75 (to be kind) to 0.9, 0.95. How is
>this possible?
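
To put numbers on the two replies above, here is a minimal sketch in R, assuming the p-values really are uniform on (0, 1) when the null hypothesis holds. In that case only about a quarter of them should exceed 0.75, the largest of 7 independent p-values has expected value 7/8, and a maximum no larger than 0.8435 should turn up in roughly 30% of such batches of 7. The names nrep and batch and the seed are illustrative choices, not values from the thread; only the sample size 2500 and the rate 0.4 come from the original post.

```r
# Sketch only: nrep, batch and the seed are illustrative assumptions,
# not values taken from the thread (n = 2500 and rate = 0.4 are).
set.seed(1)
nrep  <- 700   # number of simulated tests (a multiple of 7)
batch <- 7     # the original post ran the test 7 times

# p-values of the KS test when the null hypothesis is true
pvals <- replicate(nrep, ks.test(rexp(2500, 0.4), "pexp", 0.4)$p.value)

# If the p-values are uniform, about 25% of them exceed 0.75
frac_above <- mean(pvals > 0.75)

# Theory for the largest of 7 independent uniform p-values
expected_max <- batch / (batch + 1)   # 7/8
prob_max_le  <- 0.8435^batch          # P(max <= 0.8435)

# Largest p-value within each consecutive batch of 7 simulated tests
batch_max <- apply(matrix(pvals, nrow = batch), 2, max)

print(c(frac_above     = frac_above,
        expected_max   = expected_max,
        prob_max_le    = prob_max_le,
        mean_batch_max = mean(batch_max)))
```

If the simulated fraction above 0.75 comes out near 0.25 and the average batch maximum near 0.875, then the seven p-values in the original post are what one would expect from a correctly working test.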