Dear R Users,
Is the p-value reported in a two-tailed binomial exact
test in error or is it a feature?
If it is a feature, could someone provide a reference
for its two-tailed p-value computations?
Using Blaker's (2000 - Canad. J. Statist 28: 783-798)
approach,the p-value is the minimum of the two-tailed
probabilities $P \left(Y\geq y_{obs}\right)$ and
$P\left(Y\leq y_{obs}\right)$ plus an attainable
probability in the other tail that is as close as
possible to, but not greater than that one-tailed
probability. Consider the following examples
performed in R version 1.9
under windows 2000.
> binom.test(110,500,.2)
Exact binomial test
data: 110 and 500
number of successes = 110, number of trials = 500, p-
value = 0.2636
alternative hypothesis: true probability of success is
not equal to 0.2
95 percent confidence interval:
0.1844367 0.2589117
sample estimates:
probability of success
0.22
(P-value should be according to the Blaker definition
0.2881)
> binom.test(90,500,.2)
Exact binomial test
data: 90 and 500
number of successes = 90, number of trials = 500, p-
value = 0.2881
alternative hypothesis: true probability of success is
not equal to 0.2
95 percent confidence interval:
0.1473006 0.2165364
sample estimates:
probability of success
0.18
(P-value should be according to the Blaker definition
0.2647)
The following code (only checked on windows) reports
the Blaker definition of the p-value.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
bino.test <-
function(x, n, p = 0.5, alternative = c
("two.sided", "less", "greater"), conf.level =
0.95)
{
eps <- sqrt(.Machine$double.eps)
if(any(is.na(x) || (x < 0) || (x != round(x))))
stop("x must be nonnegative and
integer")
if(length(x) == 2) {
n <- sum(x)
x <- x[1]
}
else if(length(x) == 1) {
if((length(n) > 1) || is.na(n) || (n <
1) || (n != round(n)) || (x > n))
stop("n must be a positive
integer >= x")
}
else stop("incorrect length of x")
if(!missing(p) && (length(p) > 1 || is.na(p)
|| p < 0 || p > 1))
stop("p must be a single number
between 0 and 1")
alternative <- match.arg(alternative)
if(!((length(conf.level) == 1) && is.finite
(conf.level) && (conf.level > 0) &&
(conf.level < 1)))
stop("conf.level must be a single
number between 0 and 1")
DNAME <- paste(deparse(substitute(x)), "and",
deparse(substitute(n)))
PVAL <- switch(alternative,
less = pbinom(x, n, p),
greater = 1 - pbinom(x - 1, n, p),
{
pvec <- pbinom(0:n, n, p)
if(x/n < p) {
pb <- pbinom(x, n, p)
p2 <- max((1 - pvec)
[pb - (1 - pvec) >= - eps * pb], 0)
min(1, pb + p2)
}
else {
pb <- (1 - pbinom(x -
1, n, p))
p2 <- max(pvec[pb -
pvec >= - eps * pb], 0)
min(1, pb + p2)
}
}
)
p.L <- function(x, n, alpha)
{
if(x == 0)
0
else qbeta(alpha, x, n - x + 1)
}
p.U <- function(x, n, alpha)
{
if(x == n)
1
else qbeta(1 - alpha, x + 1, n - x)
}
CINT <- switch(alternative,
less = c(0, p.U(x, n, 1 - conf.level)),
greater = c(p.L(x, n, 1 - conf.level),
1),
two.sided = {
alpha <- (1 - conf.level)/2
c(p.L(x, n, alpha), p.U(x, n,
alpha))
}
)
attr(CINT, "conf.level") <- conf.level
ESTIMATE <- x/n
names(x) <- "number of successes"
names(n) <- "number of trials"
names(ESTIMATE) <- names(p) <- "probability of
success"
structure(list(statistic = x, parameter = n,
p.value = PVAL, conf.int = CINT, estimate =
ESTIMATE,
null.value = p, alternative =
alternative, method = "Exact binomial test",
data.name = DNAME),
class = "htest")
}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Thanks in advance for the responses. I am not
currently on the R-mailing list. So, if you could
cc your responses to alan.arnholt at unavarra.es I would
be most thankful.
Alan Arnholt
---------------------------------------------
This message was sent using Endymion MailMan.
http://www.endymion.com/products/mailman/