hello,
i have been using cor.test() to calculate the correlation coefficient and p
values for some data. however, since the data consist of two dichotomous
(binary) sequences, i understand that simply using the pearson correlation is
not sufficient. after a bit of research i found that the tetrachoric
correlation is what i am after. the polycor package and its polychor()
function seem to do precisely what i want, but i don't get p values out of
polychor(), only the standard error of the estimate.
so, in a rather naive way i have tried to write a function which will return a
list with similar fields as what one gets from cor.test(). not being terribly
strong with statistics though, i am not sure whether this is entirely correct.
could someone tell me if i am on the right track... or point out where i am
going wrong?
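library(polycor)  # provides polychor()

tetrachoric.test <- function(x, y) {
  # for two binary variables polychor() estimates the tetrachoric correlation
  p <- polychor(x, y, std.err = TRUE)
  # z statistic: estimate divided by its standard error
  p$statistic <- p$rho / sqrt(p$var[1, 1])
  p$estimate <- p$rho
  # two-sided p value from the standard normal distribution
  p$p.value <- 2 * pnorm(-abs(p$statistic))
  p
}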
my assumption is that the statistic is approximately standard normal, so that
the two-sided p value is the combined area in both tails of that distribution
beyond |statistic|. is that right?
> x <- as.integer(runif(20) > 0.5)
> y <- as.integer(runif(20) > 0.5)
> p <- tetrachoric.test(x, y)
> p$statistic
[1] -0.2616866
> p$p.value
[1] 0.7935631
> p$var
          [,1]
[1,] 0.1452105
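
to make the two-tail assumption concrete, here is a small base-R sketch (it does not need polycor) that takes the statistic printed above and adds up the two tail areas separately. the variable names are just for illustration, and it assumes the statistic can be treated as standard normal, which is exactly the part i am unsure about.

```r
# statistic as printed in the session above
z <- -0.2616866

# two-sided p value: area in both tails of the standard normal beyond |z|
p.upper <- pnorm(abs(z), lower.tail = FALSE)  # P(Z >  |z|)
p.lower <- pnorm(-abs(z))                     # P(Z < -|z|)
p.two.sided <- p.upper + p.lower

# the same quantity written as 2 * (1 - pnorm(|z|))
p.formula <- 2 * (1 - pnorm(abs(z)))

print(p.two.sided)
print(all.equal(p.two.sided, p.formula))
```

if the assumption holds, the summed tail areas should agree with the p value reported by tetrachoric.test() to about six decimal places.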
thanks for any help!
best regards,
andrew.