Sachinthaka Abeywardana
2013-Mar-08 03:36 UTC
[R] getting covariance ignoring NaN missing values
Hi all,

I have a matrix that has many NaN values. As soon as one of the columns has a missing (NaN) value the covariance estimation gets thrown off. Is there a robust way to do this?

Thanks,
Sachin

> a=array(rnorm(9),dim=c(3,3))
> a
            [,1]       [,2]      [,3]
[1,] -0.79418236  0.7813952  0.855881
[2,] -1.65347906 -1.9462446 -0.376325
[3,] -0.03144987  0.6756862 -1.879801
> a[3,2]=NAN
Error: object 'NAN' not found
> a[3,2]=NaN
> a
            [,1]       [,2]      [,3]
[1,] -0.79418236  0.7813952  0.855881
[2,] -1.65347906 -1.9462446 -0.376325
[3,] -0.03144987        NaN -1.879801
> cov(a)
           [,1] [,2]       [,3]
[1,]  0.6585217   NA -0.5777408
[2,]         NA   NA         NA
[3,] -0.5777408   NA  1.8771214
Hi,

If you look at ?cov, there are options for 'use':

set.seed(15)
a=array(rnorm(9),dim=c(3,3))
a[3,2]<- NaN

cov(a,use="complete.obs")
#           [,1]        [,2]       [,3]
#[1,]  1.2360602 -0.32167789  0.8395953
#[2,] -0.3216779  0.08371491 -0.2185001
#[3,]  0.8395953 -0.21850006  0.5702960

cov(a,use="na.or.complete")
#           [,1]        [,2]       [,3]
#[1,]  1.2360602 -0.32167789  0.8395953
#[2,] -0.3216779  0.08371491 -0.2185001
#[3,]  0.8395953 -0.21850006  0.5702960

cov(a,use="pairwise.complete.obs")
#           [,1]        [,2]       [,3]
#[1,]  1.2570603 -0.32167789  0.7377472
#[2,] -0.3216779  0.08371491 -0.2185001
#[3,]  0.7377472 -0.21850006  0.4433438

A.K.
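
For anyone wondering why "complete.obs" and "pairwise.complete.obs" give different numbers above, here is a small sketch of what each option appears to do. The matrix m and its values are made up for illustration and are not from the thread. As I understand ?cov, "complete.obs" drops every row that contains a missing value before computing anything, while "pairwise.complete.obs" drops rows separately for each pair of columns.

```r
# Hypothetical 4x3 matrix; row 3 has a NaN in column 2.
m <- cbind(c(1, 2, 4, 7),
           c(2, 1, NaN, 3),
           c(3, 5, 6, 2))

# Rows with no missing values (complete.cases treats NaN as missing).
ok <- complete.cases(m)

# "complete.obs": same as dropping incomplete rows first.
cov_complete <- cov(m, use = "complete.obs")
cov_manual   <- cov(m[ok, ])

# "pairwise.complete.obs": each entry uses the rows complete for that pair.
cov_pairwise <- cov(m, use = "pairwise.complete.obs")

# Columns 1 and 3 have no missing values, so all four rows are used.
pair_13 <- cov(m[, 1], m[, 3])

# Columns 1 and 2 share only the rows where column 2 is not missing.
keep_12 <- !is.na(m[, 2])
pair_12 <- cov(m[keep_12, 1], m[keep_12, 2])

print(all.equal(cov_complete, cov_manual))
print(all.equal(cov_pairwise[1, 3], pair_13))
print(all.equal(cov_pairwise[1, 2], pair_12))
```

One thing to keep in mind: since the pairwise entries can be based on different subsets of rows, ?cov notes that the resulting matrix is not guaranteed to be positive semi-definite. If that matters downstream, "complete.obs" may be the safer choice, at the cost of discarding more data.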