Dennis Fisher
2005-Aug-13 00:11 UTC
[R] R/S-Plus/SAS yield different results for Kendall-tau and Spearman nonparametric regression
Colleagues,

I ran some nonparametric regressions in R (run in RedHat Linux), then a colleague repeated the analyses in SAS. When we obtained different results, I tested S-Plus (same Linux box) and got yet different results. I replicated the results with a small dataset:

DATA:
 37.5  23
 37.5  13
 25    16
 25    12
100    15
 12.5  19
 50    20
100    13
100    10
100    10
100    16
 50    10
 87.5  13
100    15
 50    11
100    14
 50    19
 87.5  20
100    20
 37.5  20
100    13
100    14
 50    15
100    17
100    14

Code for S-Plus and R:

DATA <- read.table("NonparametricRegressionData")
cor.test(DATA[,1], DATA[,2], method = "spearman")
cor.test(DATA[,1], DATA[,2], method = "kendall")

-------------------------------------------------------------
S-Plus (version 6)

> cor.test(DATA[,1], DATA[,2], method = "spearman")

        Spearman's rank correlation

data:  DATA[, 1] and DATA[, 2]
normal-z = -1.1028, p-value = 0.2701
alternative hypothesis: true rho is not equal to 0
sample estimates:
        rho
 -0.2247199

> cor.test(DATA[,1], DATA[,2], method = "kendall")

        Kendall's rank correlation tau

data:  DATA[, 1] and DATA[, 2]
normal-z = -1.0583, p-value = 0.2899
alternative hypothesis: true tau is not equal to 0
sample estimates:
   tau
 -0.14

------------------------------------------------------------
R 2.1.1

> cor.test(DATA[,1], DATA[,2], method = "spearman")

        Spearman's rank correlation rho

data:  DATA[, 1] and DATA[, 2]
S = 3184, p-value = 0.2791
alternative hypothesis: true rho is not equal to 0
sample estimates:
       rho
-0.2247199

Warning message:
p-values may be incorrect due to ties in: cor.test.default(DATA[, 1], DATA[, 2], method = "spearman")

> cor.test(DATA[,1], DATA[,2], method = "kendall")

        Kendall's rank correlation tau

data:  DATA[, 1] and DATA[, 2]
z = -1.1948, p-value = 0.2322
alternative hypothesis: true tau is not equal to 0
sample estimates:
       tau
-0.1705247

Warning message:
Cannot compute exact p-value with ties in: cor.test.default(DATA[, 1], DATA[, 2], method = "kendall")

------------------------------------------
SAS

Spearman: Rho: -0.22472  P: 0.2802
Kendall:  Tau: -0.17052  P: 0.2899

Each of the programs yields some differences, possibly because of how ties are handled (R warns about this). Can anyone enlighten me?

Dennis Fisher

Dennis Fisher MD
P < (The "P Less Than" Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-415-564-2220
www.PLessThan.com
Peter Dalgaard
2005-Aug-15 03:18 UTC
[R] R/S-Plus/SAS yield different results for Kendall-tau and Spearman nonparametric regression
Dennis Fisher <fisher at plessthan.com> writes:

> Colleagues,
> I ran some nonparametric regressions in R (run in RedHat Linux), then
> a colleague repeated the analyses in SAS. When we obtained different
> results, I tested S-Plus (same Linux box). And, got yet different
> results. I replicated the results with a small dataset:
>
> DATA:

(They came across somewhat garbled, but we'll believe you...)

...

> Each of the programs yields some differences, possibly because of how
> ties are handled (R warns about this). Can anyone enlighten me?

Ties are certainly involved in the Spearman case. There are more accurate expressions for the variance of the test statistic in the tied case than the formula that R is using. As you see, the difference is not exactly huge (at least for a small number of ties), but it is something that we should get around to fixing.

I assume that there is a similar issue with Kendall's tau. In addition, S-PLUS appears to modify the actual definition of the test statistic, which might be a matter of taste. (K's tau relies on counting concordant and discordant pairs relative to the total number of pairs, and with ties, some pairs will be undecided. You can either discard such pairs or count them as zeros. S-PLUS appears to be doing the latter. A quick test is to notice that x <- y <- rep(0:1,4) gives a tau that is less than 1 in S-PLUS but gives 1 in R.)

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907
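
The two tie conventions for Kendall's tau described above can be checked by hand in R. What follows is only a rough sketch: kendall_both() is a hypothetical helper written for this illustration (it is not part of R or S-PLUS), and it assumes that "count them as zeros" means dividing by the total number of pairs (often called tau-a), while the other convention is the tie-adjusted denominator usually called tau-b. Whether S-PLUS or SAS compute exactly these quantities is not something the sketch can confirm; it only shows that the two definitions give the two estimates reported in the thread.

```r
# Hypothetical helper (not from R or S-PLUS): Kendall's tau under two
# tie conventions, computed by brute force over all pairs.
kendall_both <- function(x, y) {
  n  <- length(x)
  # Sign of every pairwise difference; a tie in a variable gives 0.
  sx <- sign(outer(x, x, "-"))
  sy <- sign(outer(y, y, "-"))
  upper <- upper.tri(sx)               # visit each unordered pair once
  s  <- sum((sx * sy)[upper])          # concordant minus discordant pairs
  n0 <- n * (n - 1) / 2                # total number of pairs
  tx <- sum(sx[upper] == 0)            # pairs tied in x
  ty <- sum(sy[upper] == 0)            # pairs tied in y
  c(tau_a = s / n0,                             # tied pairs count as zeros
    tau_b = s / sqrt((n0 - tx) * (n0 - ty)))    # tie-adjusted denominator
}

# The quick test from the reply: perfectly associated data with many ties.
x0 <- y0 <- rep(0:1, 4)
toy <- kendall_both(x0, y0)
print(toy)   # tau_a = 16/28 (about 0.571), tau_b = 1

# The 25 data pairs from the original post.
x <- c(37.5, 37.5, 25, 25, 100, 12.5, 50, 100, 100, 100, 100, 50, 87.5,
       100, 50, 100, 50, 87.5, 100, 37.5, 100, 100, 50, 100, 100)
y <- c(23, 13, 16, 12, 15, 19, 20, 13, 10, 10, 16, 10, 13,
       15, 11, 14, 19, 20, 20, 20, 13, 14, 15, 17, 14)
res <- kendall_both(x, y)
print(res)   # tau_a = -0.14, tau_b about -0.1705

# R's built-in estimate, for comparison with tau_b.
print(cor(x, y, method = "kendall"))
```

On the posted data there are 42 more discordant than concordant pairs out of 300, so the first convention gives -42/300 = -0.14 (the S-Plus estimate) and the second gives about -0.1705 (the R and SAS estimates). This accounts for the difference in the tau estimates only; the differing p-values depend on the variance formula each program uses, which the sketch does not address.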