Luigi Marongiu
2023-Mar-22 21:12 UTC
[R] How to test the difference between paired correlations?
Hello,

I have three numerical variables and I would like to test whether their correlations are significantly different. I have seen that there is a package that "Test the difference between two (paired or unpaired) correlations". [https://www.personality-project.org/r/html/paired.r.html]

However, there is the need to convert the correlations to "z scores using the Fisher r-z transform". I have seen that there is another package that does that [https://search.r-project.org/CRAN/refmans/DescTools/html/FisherZ.html]. Yet, I do not understand how to process the data. Should I pass the raw data or the correlations directly?

I have made the following working example:

```
# define data
v1 <- c(62.480, 59.492, 74.060, 88.519, 91.417, 53.907, 64.202, 62.426, 54.406, 88.117)
v2 <- c(56.814, 42.005, 56.074, 65.990, 81.572, 53.855, 50.335, 63.537, 41.713, 78.265)
v3 <- c(54.170, 64.224, 57.569, 85.089, 104.056, 48.713, 61.239, 60.290, 67.308, 71.179)

# visual exploration
par(mfrow=c(2, 1))
plot(v2~v1, ylim=c(min(c(v1,v2,v3)), max(c(v1,v2,v3))),
     xlim=c(min(c(v1,v2,v3)), max(c(v1,v2,v3))), main="V1 vs V2")
abline(lm(v2~v1))
plot(v3~v1, ylim=c(min(c(v1,v2,v3)), max(c(v1,v2,v3))),
     xlim=c(min(c(v1,v2,v3)), max(c(v1,v2,v3))), main="V1 vs V3")
abline(lm(v3~v1))

## test differences in correlation
# convert raw data into z-scores
library(psych)
library(DescTools)
FisherZ(v1)
# I cannot convert the raw data into z scores (same for the other variables):
> [1] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
> Warning message:
> In log((1 + rho)/(1 - rho)) : NaNs produced

# convert correlations into z scores
# (the correlation score of 0.79 has been converted into 1.08; is this correct?)
FisherZ(lm(v2~v1)$coefficients[2])
> v1
> 1.081667
lm(v2~v1)$coefficients[2]
> v1
> 0.7938164

# apply test
v1_v2 = FisherZ(lm(v2~v1)$coefficients[2])
v1_v3 = FisherZ(lm(v3~v1)$coefficients[2])
paired.r(v1_v2, v1_v3, yz=NULL, length(v1), n2=NULL, twotailed=TRUE)
> Call: paired.r(xy = v1_v2, xz = v1_v3, yz = NULL, n = length(v1), n2 = NULL,
>     twotailed = TRUE)
> [1] "test of difference between two independent correlations"
> z = NaN With probability = NaN
> Warning messages:
> 1: In log((1 + xy)/(1 - xy)) : NaNs produced
> 2: In log((1 + xz)/(1 - xz)) : NaNs produced
```

What is the right way to run this test? Should I also supply yz?

Thank you

--
Best regards,
Luigi
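A note on the example above, offered as a sketch rather than a definitive answer. The Fisher r-to-z transform is z = 0.5 * log((1 + r) / (1 - r)), which base R exposes as atanh(r). It is defined only for |r| < 1, which would explain the NaNs when raw data are passed in. The warnings in the post (log((1 + xy)/(1 - xy))) suggest that paired.r() applies this transform itself, so it presumably expects plain correlations, not values that were already transformed. Note also that lm(v2~v1)$coefficients[2] is a regression slope, not a correlation. The call below assumes the paired.r(xy, xz, yz, n) argument order shown in the post's own output, and assumes (from my reading of the psych documentation) that supplying yz selects the test for dependent correlations.

```r
# Data from the post above
v1 <- c(62.480, 59.492, 74.060, 88.519, 91.417, 53.907, 64.202, 62.426, 54.406, 88.117)
v2 <- c(56.814, 42.005, 56.074, 65.990, 81.572, 53.855, 50.335, 63.537, 41.713, 78.265)
v3 <- c(54.170, 64.224, 57.569, 85.089, 104.056, 48.713, 61.239, 60.290, 67.308, 71.179)

# Pearson correlations (not regression slopes); these always lie in [-1, 1]
r12 <- cor(v1, v2)
r13 <- cor(v1, v3)
r23 <- cor(v2, v3)

# Fisher r-to-z transform: z = 0.5 * log((1 + r) / (1 - r)) = atanh(r).
# Defined only for |r| < 1, so raw data such as v1 produce NaN.
z12 <- atanh(r12)
z13 <- atanh(r13)

# The slope of lm(v2 ~ v1) equals r12 * sd(v2) / sd(v1); it matches the
# correlation only when both variables have the same standard deviation.
slope <- unname(coef(lm(v2 ~ v1))[2])

# Pass the correlations themselves. Because v1 is shared by both pairs,
# yz (the correlation between v2 and v3) is supplied as well.
# Assumption: psych is installed; the block is skipped otherwise.
if (requireNamespace("psych", quietly = TRUE)) {
  print(psych::paired.r(xy = r12, xz = r13, yz = r23, n = length(v1)))
}
```

With only ten observations, any such test has little power, so the result is best read as illustrative.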
Ebert,Timothy Aaron
2023-Mar-23 00:23 UTC
[R] How to test the difference between paired correlations?
If you are open to other options, consider a randomization test.

The null hypothesis is that there is no difference between the two correlations. If I have two equations, y=x and y=z, and there is no difference between them, then it should not matter whether an observation came from x or from z.

Randomize the x and z observations. For each randomization, calculate a correlation for y=x and for y=z, and then the absolute value of the difference between the two correlations. Generate a frequency distribution from 100,000+ randomizations. Then locate the observed difference within that distribution: the proportion of randomized differences that are as large as or larger than the observed one is your p-value.

Tim
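The randomization procedure Tim describes can be sketched in base R as follows. This is one possible reading, not necessarily what he had in mind: "randomize the x and z observations" is taken to mean that, within each row, the x and z values are swapped with probability 0.5 while y stays fixed. The names abs_cor_diff and perm_cor_diff are hypothetical helpers introduced here, with y = v1, x = v2, z = v3 from the original post. The default of 10,000 randomizations is chosen only to keep the sketch quick; n_perm can be raised to the 100,000+ suggested above.

```r
# Data from the original post
v1 <- c(62.480, 59.492, 74.060, 88.519, 91.417, 53.907, 64.202, 62.426, 54.406, 88.117)
v2 <- c(56.814, 42.005, 56.074, 65.990, 81.572, 53.855, 50.335, 63.537, 41.713, 78.265)
v3 <- c(54.170, 64.224, 57.569, 85.089, 104.056, 48.713, 61.239, 60.290, 67.308, 71.179)

# Hypothetical helper: absolute difference between cor(y, x) and cor(y, z)
abs_cor_diff <- function(y, x, z) abs(cor(y, x) - cor(y, z))

# Hypothetical helper: randomization test for the difference in correlations.
# Assumption: under the null, x and z are exchangeable within each row.
perm_cor_diff <- function(y, x, z, n_perm = 10000, seed = 1) {
  set.seed(seed)                      # fixed seed so the sketch is reproducible
  observed <- abs_cor_diff(y, x, z)
  n <- length(y)
  perm <- replicate(n_perm, {
    swap <- runif(n) < 0.5            # per row: swap x and z or leave them
    x_star <- ifelse(swap, z, x)
    z_star <- ifelse(swap, x, z)
    abs_cor_diff(y, x_star, z_star)
  })
  # p-value: share of randomized differences at least as large as observed
  list(observed = observed, p.value = mean(perm >= observed), perm = perm)
}

res <- perm_cor_diff(y = v1, x = v2, z = v3)
res$observed
res$p.value
```

Swapping values between x and z only makes sense if the two variables are on comparable scales, as they appear to be here; otherwise standardizing both first would be one option.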