onyourmark
2009-May-08 05:39 UTC
[R] if ((x >.2 || x<(-.2)) && (col(x)!=row(x))) {x=x[,-col(x)]}
Hi. I have a correlation matrix 'x' of size 923x923, and I need to remove variables that are highly correlated. I don't have a sophisticated way of selecting which of the two variables in a highly correlated pair to remove. I thought I would just go through each entry of the correlation matrix and, if it is greater than 0.6 (or less than -0.6), remove that column and then redo the check from scratch on the matrix (which now has one less column).

As a test, I tried

if ((x >.2 || x<(-.2)) && col(x)!=row(x)) {x=x[,-col(x)]}

but it does not remove any columns.

Also, I realized that I actually need to remove the row associated with that variable as well, so perhaps the section inside {} needs to be something like:

{x=x[-col(x),-col(x)]}

Any idea how to do this?
Thank you.
David Winsemius
2009-May-08 12:54 UTC
[R] if ((x >.2 || x<(-.2)) && (col(x)!=row(x))) {x=x[, -col(x)]}
You are comparing a matrix to a scalar, which produces a logical matrix with one TRUE or FALSE per entry, not the single TRUE or FALSE that if() needs. You are also using || and && to combine those objects, which will not give you the results you expect either: they return a single TRUE or FALSE, taken from the first element of each operand. In your test that first element is the [1,1] diagonal entry, where col(x) != row(x) is FALSE, so the body is never run. Use | and & in situations where you want element-wise comparisons of vectors.

You might start by experimenting on a much smaller object and seeing how your efforts at indexing could be improved.

> X <- matrix(runif(9), nrow=3)
> X
          [,1]      [,2]      [,3]
[1,] 0.4151688 0.2116687 0.6049845
[2,] 0.7924464 0.6624862 0.8444203
[3,] 0.2634175 0.3357537 0.6923846
> X > .8
      [,1]  [,2]  [,3]
[1,] FALSE FALSE FALSE
[2,] FALSE FALSE  TRUE
[3,] FALSE FALSE FALSE

# Use apply to create a logical vector that flags the unwanted rows:
> apply(X, 1, function(x) max(x) > 0.8)
[1] FALSE  TRUE FALSE

# Now use that construction on both rows and columns. Negate the flags
# with !, not -: a minus sign turns TRUE into the index -1, which would
# drop the first row and column instead of the flagged ones.
> X[!apply(X, 1, function(x) max(x) > 0.8), !apply(X, 2, function(x) max(x) > 0.8)]
          [,1]      [,2]
[1,] 0.4151688 0.2116687
[2,] 0.2634175 0.3357537

-- David

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
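
The procedure described in the original question (find an off-diagonal correlation beyond the cutoff, drop that variable's row and column, then re-check from scratch) could be sketched roughly as below. This is only a minimal sketch, not code from the thread: the function name drop_correlated and its cutoff argument are hypothetical, the choice of which member of a pair to drop is arbitrary (the column of the first hit), and the small matrix cm is made-up illustration data.

```r
# Hypothetical helper (not from the thread): repeatedly drop one variable
# from a square correlation matrix until no off-diagonal entry has an
# absolute value above `cutoff`.
drop_correlated <- function(cm, cutoff = 0.6) {
  stopifnot(is.matrix(cm), nrow(cm) == ncol(cm))
  repeat {
    # Element-wise test (note `&`, not `&&`): TRUE marks an off-diagonal
    # entry whose absolute correlation exceeds the cutoff.
    too_high <- abs(cm) > cutoff & row(cm) != col(cm)
    # Row/column positions of all hits; which() treats NA as FALSE.
    hits <- which(too_high, arr.ind = TRUE)
    if (nrow(hits) == 0) break
    # Arbitrary choice: drop the column variable of the first hit,
    # together with its row, then re-check the smaller matrix.
    j <- hits[1, "col"]
    cm <- cm[-j, -j, drop = FALSE]
  }
  cm
}

# Made-up 3x3 correlation matrix: a and b are highly correlated.
vars <- c("a", "b", "c")
cm <- matrix(c(1.0, 0.9, 0.1,
               0.9, 1.0, 0.2,
               0.1, 0.2, 1.0),
             nrow = 3, dimnames = list(vars, vars))

kept <- drop_correlated(cm, cutoff = 0.6)
colnames(kept)  # b and c remain; a is dropped
```

Removing by a single positive position with a minus sign (cm[-j, -j]) is safe here because j is a numeric index, not a logical flag. On a 923x923 matrix the loop runs at most 922 times, each pass rescanning the whole matrix, which should still be quick at that size.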