Hi, I was trying to draw an empirical distribution function with uniform confidence bands. So I tried to find a way to calculate values of the Kolmogorov-Smirnov Distribution but failed. I guess it must be hidden somewhere (since the ks-test is implemented), but I was unable to find it. Is there any way to do this? Thanks Leif Boysen
kjetil brinchmann halvorsen
2003-Jul-21 21:23 UTC
[R] Confidence Band for empirical distribution function
On 21 Jul 2003 at 15:42, Leif.Boysen wrote:
Here are some functions doing this using the package stepfun:
ecdf.ksCI <- function(x, main = NULL, sub = NULL,
xlab = deparse(substitute(x)), ...)
{
require(stepfun)
xlab
if(is.null(main))
main <- paste("ecdf(",deparse(substitute(x)),") + 95%
K.S.bands",
sep="")
n <- length(x)
if(is.null(sub))
sub <- paste("n = ", n)
ec <- ecdf(x)
xx <- get("x", envir=environment(ec))# = sort(x)
yy <- get("y", envir=environment(ec))
D <- approx.ksD(n)
yyu <- pmin(yy+D, 1)
yyl <- pmax(yy-D, 0)
ecu <- stepfun(xx, c(yyu, 1) )
ecl <- stepfun(xx, c(yyl, yyl[n]) )
## Plots -- all calling plot.stepfun
plot(ec, main = main, sub = sub, xlab = xlab, ...)
plot(ecu, add=TRUE, verticals=TRUE, do.points=FALSE,
col.hor="red" , col.vert="red", ...)
plot(ecl, add=TRUE, verticals=TRUE, do.points=FALSE,
col.hor="red", col.vert="red", ...)
}
approx.ksD <- function(n)
{
## approximations for the critical level for Kolmogorov-Smirnov
## statistic D,
## for confidence level 0.95. Taken from Bickel & Doksum, table
IX,
## p.483
## and Lienert G.A.(1975) who attributes to Miller,L.H.(1956),
JASA
ifelse(n > 80,
1.358 /( sqrt(n) + .12 + .11/sqrt(n)),##Bickel&Doksum,
table
##IX,p.483
splinefun(c(1:9, 10, 15, 10 * 2:8),# from Lienert
c(.975, .84189, .70760, .62394, .56328,# 1:5
.51926, .48342, .45427, .43001, .40925,# 6:10
.33760, .29408, .24170, .21012,# 15,20,30,40
.18841, .17231, .15975, .14960)) (n))
}
\name{ecdf.ksCI}
\alias{ecdf.ksCI}
\title{ Plotting the empirical distribution function together with
confidence
curves. }
\description{ Plots the empirical distribution function for one-
dimensional
data, together with upper and lower confidence curves. Always
uses
pointwise confidence level of 95\%.
}
\usage{
ecdf.ksCI(x, main=NULL, sub=NULL, xlab = deparse(substitute(x)), ...)
}
%- maybe also `usage' for other objects documented here.
\arguments{
\item{x}{ \code{x} numerical vector of observations. }
\item{\dots}{ \code{\dots} arguments given to
\code{\link{plot.stepfun}}.}
}
\details{
Uses the \code{\link{stepfun}} package.
}
\value{
Nothing. Used for its side effect, to produce a plot.
}
\references{ Peter J. Bickel & Kjell A. Doksum: Mathematical
Statistics, Basic Ideas and Selected
Topics. Holden-Day, 1977. }
\author{ Kjetil Halvorsen }
\seealso{ \code{\link{ecdf}} and \code{\link{plot.stepfun}} in
package
\code{\link{stepfun}}. }
\examples{
ecdf.ksCI( rchisq(50,3) )
}
\keyword{ hplot }
\keyword{nonparametric}
Kjetil Halvorsen
> Hi,
>
> I was trying to draw an empirical distribution function with uniform
> confidence bands. So I tried to find a way to calculate values of the
> Kolmogorov-Smirnov Distribution but failed.
> I guess it must be hidden somewhere (since the ks-test is implemented),
> but I was unable to find it.
>
> Is there any way to do this?
>
> Thanks
>
> Leif Boysen
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Dear Leif,
If you look at the definition of ks.test, you'll find the lines
pkstwo <- function(x, tol = 1e-06) {
if (is.numeric(x))
x <- as.vector(x)
else stop("Argument x must be numeric")
p <- rep(0, length(x))
p[is.na(x)] <- NA
IND <- which(!is.na(x) & (x > 0))
if (length(IND) > 0) {
p[IND] <- .C("pkstwo", as.integer(length(x)), p =
as.double(x[IND]),
as.double(tol), PACKAGE = "ctest")$p
}
return(p)
}
which calls C code to calculate the p-values given the test statistic.
You'll find explanations on what this function does in the original C file
src/library/ctest/src/ks.c
I haven't tried that but I assume that you could use it to calculate
p-values
given the test-statistics yourself.
Please also note that ks.test() returns the p-value as well.
If you need quantiles, I assume you need to invert the cdf yourself,
e.g. using uniroot().
HTH
Thomas
---
Thomas Hotz
Research Associate in Medical Statistics
University of Leicester
United Kingdom
Department of Epidemiology and Public Health
22-28 Princess Road West
Leicester
LE1 6TP
Tel +44 116 252-5410
Fax +44 116 252-5423
Division of Medicine for the Elderly
Department of Medicine
The Glenfield Hospital
Leicester
LE3 9QP
Tel +44 116 256-3643
Fax +44 116 232-2976
> -----Original Message-----
> From: Leif.Boysen [mailto:boysen at math.uni-goettingen.de]
> Sent: 21 July 2003 14:42
> To: r-help at stat.math.ethz.ch
> Subject: [R] Confidence Band for empirical distribution function
>
>
> Hi,
>
> I was trying to draw an empirical distribution function with uniform
> confidence bands. So I tried to find a way to calculate values of the
> Kolmogorov-Smirnov Distribution but failed.
> I guess it must be hidden somewhere (since the ks-test is
> implemented),
> but I was unable to find it.
>
> Is there any way to do this?
>
> Thanks
>
> Leif Boysen
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
>
Dear Kjetil, As I already mentioned, it appears that there isn't a function available calculating the quantiles directly (at least, it doesn't appear in the C source of ctest). So as I already suggested, uniroot (or a similar C routine which calls the corresponding C code directly) is probably the best you can do (apart from writing it completely yourself). I didn't program this using uniroot, but I'd certainly try the following for speed-up: - For symmetry reasons, you only need to compute half of the quantiles. - The quantiles depend smoothly on the probabilities (of your reference distribution). Therefore, calculating only a "few" for probabilities between 0 and 0.5, and using (e.g. linear) interpolation should be satisfying. I am sorry not be of more help. HTH anyway Thomas> -----Original Message----- > From: kjetil brinchmann halvorsen [mailto:kjetil at entelnet.bo] > Sent: 23 July 2003 15:46 > To: Hotz, T. > Subject: RE: [R] Confidence Band for empirical distribution function > > > On 22 Jul 2003 at 11:37, Hotz, T. wrote: > > > Dear Leif, > > > > If you look at the definition of ks.test, you'll find the lines > > > > pkstwo <- function(x, tol = 1e-06) { > > if (is.numeric(x)) > > x <- as.vector(x) > > else stop("Argument x must be numeric") > > p <- rep(0, length(x)) > > p[is.na(x)] <- NA > > IND <- which(!is.na(x) & (x > 0)) > > if (length(IND) > 0) { > > p[IND] <- .C("pkstwo", as.integer(length(x)), p = > as.double(x[IND]), > > as.double(tol), PACKAGE = "ctest")$p > > } > > return(p) > > } > > > > which calls C code to calculate the p-values given the test > statistic. > > You'll find explanations on what this function does in the > original C file > > src/library/ctest/src/ks.c > > > > I haven't tried that but I assume that you could use it to > calculate p-values > > given the test-statistics yourself. > > That could certainly be done, but what was asked for is the inverse, > which can be calculated, using for instance uniroot(). I tried that, > but it is to slow, .C() will be called repeatedly in a loop. For me > it took several minutes. > > Kjetil Halvorsen > > > > > Please also note that ks.test() returns the p-value as well. > > > > If you need quantiles, I assume you need to invert the cdf yourself, > > e.g. using uniroot(). > > > > HTH > > > > Thomas > > > > --- > > > > Thomas Hotz > > Research Associate in Medical Statistics > > University of Leicester > > United Kingdom > > > > Department of Epidemiology and Public Health > > 22-28 Princess Road West > > Leicester > > LE1 6TP > > Tel +44 116 252-5410 > > Fax +44 116 252-5423 > > > > Division of Medicine for the Elderly > > Department of Medicine > > The Glenfield Hospital > > Leicester > > LE3 9QP > > Tel +44 116 256-3643 > > Fax +44 116 232-2976 > > > > > > > -----Original Message----- > > > From: Leif.Boysen [mailto:boysen at math.uni-goettingen.de] > > > Sent: 21 July 2003 14:42 > > > To: r-help at stat.math.ethz.ch > > > Subject: [R] Confidence Band for empirical distribution function > > > > > > > > > Hi, > > > > > > I was trying to draw an empirical distribution function > with uniform > > > confidence bands. So I tried to find a way to calculate > values of the > > > Kolmogorov-Smirnov Distribution but failed. > > > I guess it must be hidden somewhere (since the ks-test is > > > implemented), > > > but I was unable to find it. > > > > > > Is there any way to do this? > > > > > > Thanks > > > > > > Leif Boysen > > > > > > ______________________________________________ > > > R-help at stat.math.ethz.ch mailing list > > > https://www.stat.math.ethz.ch/mailman/listinfo/r-help > > > > > > > ______________________________________________ > > R-help at stat.math.ethz.ch mailing list > > https://www.stat.math.ethz.ch/mailman/listinfo/r-help > > >