bbslover
2009-Nov-06 08:16 UTC
[R] another question: how to delete one of two columns with high correlation (0.95)
My program is below:

a=c(1,2,1,1,1); b=c(1,2,3,4,1); c=c(3,4,3,3,3); d=c(1,2,3,5,1); e=c(1,5,3,5,1)
data.f=data.frame(a,b,c,d,e)
origin.data<-data.f
cor.matrix<-cor(origin.data)
origin.cor<-cor.matrix
m<-0
for(i in 1:(cor.matrix[1]-1))
{
  for(j in (i+1):(cor.matrix[2]))
  {
    if (cor.matrix[i,j]>=0.95)
    {
      data.f<-data.f[,-i];
      i<-i+1
    }
  }
}
origin.cor
data.f

The result does not seem to be right.

> origin.cor
           a          b          c          d         e
a  1.0000000 -0.0857493  1.0000000 -0.1336306 0.5590170
b -0.0857493  1.0000000 -0.0857493  0.9854509 0.7669650
c  1.0000000 -0.0857493  1.0000000 -0.1336306 0.5590170
d -0.1336306  0.9854509 -0.1336306  1.0000000 0.7470179
e  0.5590170  0.7669650  0.5590170  0.7470179 1.0000000
> data.f
  b c d e
1 1 3 1 1
2 2 4 2 5
3 3 3 3 3
4 4 3 5 5
5 1 3 1 1

Either column b or column d should be deleted, since they are highly correlated (0.9854509), but neither was. Why?
Nikhil Kaza
2009-Nov-06 18:07 UTC
[R] another question: how to delete one of two columns with high correlation (0.95)
You need dim(cor.matrix)[1] (the number of rows), not cor.matrix[1] (the first element of the matrix).

Instead of a loop, the following might be better. which() returns linear indices in column-major order, so to get the column ids of the matching entries:

((which(cor.matrix >= 0.95) - 1) %/% dim(cor.matrix)[1]) + 1

For row ids, use modulus instead of integer division:

((which(cor.matrix >= 0.95) - 1) %% dim(cor.matrix)[1]) + 1

There are probably better ways than this.

Nikhil

On 6 Nov 2009, at 3:16AM, bbslover wrote:

> for(i in 1:(cor.matrix[1]-1))
> {
>   for(j in (i+1):(cor.matrix[2]))
>   {
>     if (cor.matrix[i,j]>=0.95)
>     {
>       data.f<-data.f[,-i];
>       i<-i+1
>     }
>   }
> }
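
As a rough illustration of the loop-free idea, here is a minimal sketch using the data from the original post. It relies on which(..., arr.ind = TRUE) and upper.tri() from base R in place of the index arithmetic. The names high, pairs, drop.cols and reduced are made up for this example, and the rule "drop the later column of each pair" is an assumption, not something the thread specifies.

```r
# Data from the original post.
data.f <- data.frame(a = c(1, 2, 1, 1, 1),
                     b = c(1, 2, 3, 4, 1),
                     c = c(3, 4, 3, 3, 3),
                     d = c(1, 2, 3, 5, 1),
                     e = c(1, 5, 3, 5, 1))
cor.matrix <- cor(data.f)

# cor.matrix[1] is the first element of the matrix (1), not its size,
# so the original outer loop ran over 1:0. The size comes from dim().
n <- dim(cor.matrix)[1]

# Look only above the diagonal, so the 1s on the diagonal and the
# mirrored lower-triangle entries are ignored.
high <- cor.matrix >= 0.95 & upper.tri(cor.matrix)

# arr.ind = TRUE makes which() return row/column positions directly.
pairs <- which(high, arr.ind = TRUE)

# Assumed policy: from each highly correlated pair, drop the later column.
drop.cols <- unique(pairs[, "col"])
reduced <- if (length(drop.cols) > 0) {
  data.f[, -drop.cols, drop = FALSE]
} else {
  data.f
}

names(reduced)
```

With this data two pairs exceed the threshold: b and d (0.9854509), and also a and c (correlation 1), so the sketch keeps columns a, b and e. Keeping the earlier column of each pair is arbitrary; a different rule may suit real data better.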
bbslover
2009-Nov-07 06:42 UTC
[R] another question: how to delete one of two columns with high correlation (0.95)
Thank you. I need to study this first; after that, maybe I will understand it well.

Thanks, Nikhil.