Hilmar Berger
2008-Jan-02 17:39 UTC
[R] strange behavior of cor() with pairwise.complete.obs
Hi all,

I'm not quite sure if this is a feature or a bug, or if I just fail to understand the documentation.

If I use cor() with use="pairwise.complete.obs" and method="pearson", the result is a scalar:

-> cor(c(1,2,3),c(3,4,6),use="pairwise.complete.obs",method="pearson")
[1] 0.9819805

The documentation says that

" '"pairwise.complete.obs"' only works with the '"pearson"' method for 'cov' and 'var'."

Thus, I guess that cor() should work with pairwise.complete.obs and method "kendall", or am I misinterpreting that statement?

-> cor(c(1,2,3),c(3,4,6),use="pairwise.complete.obs",method="kendall")
     [,1]
[1,]    1

Now the result is a matrix with dimensions (1,1), which is strange enough. Note that when I use "all.obs" or "complete.obs", I get a scalar for method kendall, too.

It gets worse if one tries to calculate the correlation between the columns of two matrices (i.e. cor(x,y) with x and y each being a matrix). Then

-> c=matrix(c(1,2,3,3,4,5),nrow=3,ncol=2)
-> d=matrix(c(2,3,4,NA,NA,NA),nrow=3,ncol=2)
-> cor(c,d,use="pairwise.complete.obs",method="pearson")
     [,1] [,2]
[1,]    1   NA
[2,]    1   NA

-> cor(c,d,use="pairwise.complete.obs",method="kendall")
Error: 'x' is empty (*translated from German error message*)

The behavior is reproducible in R 2.4.1 and 2.6.1 (WinXP). I noticed that in 2.7.0 something was fixed in cor() related to "complete.obs" handling. Would that fix my problems?

Any suggestions?

Thanks,
Hilmar
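One way to sidestep the scalar-versus-matrix difference described above is to strip the dim attribute from whatever cor() returns. The following is only a minimal sketch, assuming the caller passes two plain vectors and wants a single number back; it says nothing about why the return types differ.

```r
x <- c(1, 2, 3)
y <- c(3, 4, 6)

# as.numeric() drops attributes such as dim, so a 1x1 matrix and a plain
# scalar both come out as a length-1 numeric vector.
tau <- as.numeric(cor(x, y, use = "pairwise.complete.obs", method = "kendall"))
r   <- as.numeric(cor(x, y, use = "pairwise.complete.obs", method = "pearson"))

is.null(dim(tau))  # no dim attribute left, whatever cor() returned
```

Code that compares or stores the two results then no longer depends on which shape a given R version hands back.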
Daniel Malter
2008-Jan-02 19:07 UTC
[R] strange behavior of cor() with pairwise.complete.obs
Sorry, I did not get it at first. Now I see your problem: I accidentally used pearson. So it does not work for Kendall's tau or Spearman's rho.

The reason it does not work is that one column of the second matrix is entirely NA. It works if each of the matrix columns has more than one valid value to compare against:

m=matrix(c(1,2,3,3,4,5),nrow=3,ncol=2)
d=matrix(c(2,3,4,9,5,NA),nrow=3,ncol=2)
cor(d,m,method="k",use="pairwise.complete.obs")

I don't know if that helps, but it is an explanation. Why pearson works and spearman and kendall don't, I don't know.

Cheers,
Daniel

-------------------------
cuncta stricte discussurus
-------------------------

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
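Following the explanation above (an all-NA column leaves nothing to compare), a possible workaround is to loop over the column pairs and only call cor() where at least two complete observations exist. This is a sketch, not the behavior of base R: `pairwise_kendall` is a hypothetical helper name, and the threshold of two complete pairs is an assumption chosen for illustration.

```r
# Hypothetical helper (not part of base R): column-by-column Kendall's tau
# that yields NA instead of an error when a pair of columns has fewer than
# two complete observations.
pairwise_kendall <- function(x, y) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  out <- matrix(NA_real_, nrow = ncol(x), ncol = ncol(y))
  for (i in seq_len(ncol(x))) {
    for (j in seq_len(ncol(y))) {
      # rows where both columns are non-missing
      ok <- complete.cases(x[, i], y[, j])
      if (sum(ok) >= 2) {
        out[i, j] <- cor(x[ok, i], y[ok, j], method = "kendall")
      }
    }
  }
  out
}

m <- matrix(c(1, 2, 3, 3, 4, 5), nrow = 3, ncol = 2)
d <- matrix(c(2, 3, 4, NA, NA, NA), nrow = 3, ncol = 2)
res <- pairwise_kendall(m, d)
print(res)
```

For the matrices from the original report this fills the first column with tau values and leaves the second column NA, mirroring the shape of the pearson result rather than stopping with an error.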
Hilmar Berger
2008-Jan-02 20:41 UTC
[R] strange behavior of cor() with pairwise.complete.obs
Sorry, I obviously did not state clearly what the problem is (thanks Daniel):

1. Minor problem: cor() returns different types for methods "kendall" and "pearson" (matrix vs. scalar) when pairwise.complete.obs is selected.

2. Major problem: cor() fails with an error if both x and y are matrices, method="kendall" is used, pairwise.complete.obs is selected, and one column of one of the two matrices is completely NA. This does not happen for method "pearson".

Regards,
Hilmar
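To see whether a particular R installation shows the major problem, the failing call can be wrapped in tryCatch() so that the script continues either way. This is only a sketch of a check; the variable name `check` is chosen here for illustration, and which branch is taken depends on the R version in use.

```r
m <- matrix(c(1, 2, 3, 3, 4, 5), nrow = 3, ncol = 2)
d <- matrix(c(2, 3, 4, NA, NA, NA), nrow = 3, ncol = 2)

# Either a correlation matrix (no error in this R version) or a string
# carrying the error message (the problem is present).
check <- tryCatch(
  cor(m, d, use = "pairwise.complete.obs", method = "kendall"),
  error = function(e) paste("error:", conditionMessage(e))
)

print(R.version.string)
print(check)
```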