Jack Tanner
2010-Jan-21 23:07 UTC
[R] correlation significance testing with multiple factor levels
[Apologies in advance if this is too "statistics" and not enough "R".]

I've got an experiment with two sets of treatments. Each subject either received all treatments from set A or all treatments from set B.

I can compute the N pairwise correlations for all treatments in either set using cor(). If I take the mean of these N pairwise correlations, I see that the effects of treatments in set A are much more correlated than the effects of treatments in set B. (Mean correlation for set A is 0.6, mean for set B is 0.1).

This is probably wrongheaded, but I'd like to be able to report whether this is a significant difference. I know about cor.test(), but I don't know whether/how I can adapt that for my use case.

Thanks in advance for your advice.
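For concreteness, the statistic described above (the mean of the pairwise correlations within one set) could be computed roughly as follows. This is only a sketch on simulated data: the matrix `set_a`, its column names, and the way the columns are made to correlate are assumptions for illustration, not the poster's actual data.

```r
# Simulated stand-in for one set of treatments: rows are subjects,
# columns are treatments (hypothetical names t1..t3).
set.seed(1)
n <- 25
shared <- rnorm(n)                      # common component induces correlation
set_a <- cbind(t1 = shared + rnorm(n),
               t2 = shared + rnorm(n),
               t3 = shared + rnorm(n))

cor_a <- cor(set_a)                     # 3 x 3 correlation matrix
pairwise <- cor_a[lower.tri(cor_a)]     # each treatment pair counted once
n_pairs <- length(pairwise)             # choose(3, 2) pairs
mean_cor_a <- mean(pairwise)            # the summary statistic in question
print(mean_cor_a)
```

Using `lower.tri()` keeps each pair once and drops the diagonal of ones, which would otherwise inflate the mean.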
Greg Snow
2010-Jan-22 16:32 UTC
[R] correlation significance testing with multiple factor levels
If I understand your problem correctly, then one option is to use a permutation test. This tests the null hypothesis that set A and set B are identical and that the difference you are seeing is due only to random chance. The procedure is to randomly shuffle the observations between the sets and recompute your test statistic (the difference of the mean correlations) for each shuffle.

Here is one example of how you could do this using some simulated data (you should probably try different rho matrices to see how this performs under different situations).

library(MASS)

# population correlation matrices: 0.6 within set A, 0.1 within set B
rho1 <- matrix(.6, 3, 3)
diag(rho1) <- 1
rho2 <- matrix(.1, 3, 3)
diag(rho2) <- 1

# 25 simulated subjects per set, 3 treatments each
x1 <- mvrnorm(25, rep(0,3), rho1)
x2 <- mvrnorm(25, rep(0,3), rho2)
x <- rbind(x1,x2)
g <- rep(1:2, each=25)

# test statistic: difference in mean pairwise correlation between groups
tsfunc <- function(g){
  c1 <- cor(x[g==1,])
  c2 <- cor(x[g==2,])
  m1 <- mean( c1[ lower.tri(c1) ] )
  m2 <- mean( c2[ lower.tri(c2) ] )
  m1-m2
}

# 999 shuffles of the group labels, plus the observed statistic first
out <- replicate(999, tsfunc( sample(g) ) )
out <- c(tsfunc(g), out)

hist(out)
abline(v=out[1])
mean( out >= out[1] ) # one-sided p-value

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111
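As a possible variation on the permutation test in the reply above, the same idea can be wrapped in a function that takes the data and group labels as arguments instead of reading a global `x`. This is a sketch under assumptions, not part of the original reply: the names `mean_pairwise_cor` and `perm_test_mean_cor` are hypothetical, the two-sided p-value based on absolute values is an added convention, and the data are simulated with base R rather than `mvrnorm()`.

```r
# Hypothetical helper: mean of the off-diagonal pairwise correlations.
mean_pairwise_cor <- function(m) {
  cm <- cor(m)
  mean(cm[lower.tri(cm)])
}

# Permutation test for the difference in mean pairwise correlation
# between rows of x labelled 1 and rows labelled 2 in g.
perm_test_mean_cor <- function(x, g, n_perm = 999) {
  stat <- function(labels) {
    mean_pairwise_cor(x[labels == 1, , drop = FALSE]) -
      mean_pairwise_cor(x[labels == 2, , drop = FALSE])
  }
  observed <- stat(g)
  permuted <- replicate(n_perm, stat(sample(g)))  # shuffle group labels
  all_stats <- c(observed, permuted)              # observed counts as one draw
  list(
    observed    = observed,
    p_one_sided = mean(all_stats >= observed),
    p_two_sided = mean(abs(all_stats) >= abs(observed))  # assumed convention
  )
}

# Simulated example: set A columns share a common component,
# set B columns are independent.
set.seed(42)
n <- 25
shared <- rnorm(n)
x_a <- sapply(1:3, function(i) shared + rnorm(n))
x_b <- matrix(rnorm(n * 3), n, 3)
x <- rbind(x_a, x_b)
g <- rep(1:2, each = n)

res <- perm_test_mean_cor(x, g)
print(res)
```

Because the observed statistic is included among the permuted values, the smallest p-value this can report with 999 shuffles is 1/1000.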