Dear all,

I want to remove correlated columns from a data frame, df. I found some R code for this on Mr. Rajarshi Guha's page, http://cheminfo.informatics.indiana.edu/~rguha/code/R/. The code is:

#################
r2test <- function(d, cutoff = 0.8) {
  # cutoff is a threshold on R^2 and must lie in (0, 1]
  if (cutoff > 1 || cutoff <= 0) {
    stop("cutoff must satisfy 0 < cutoff <= 1")
  }
  if (!is.matrix(d) && !is.data.frame(d)) {
    stop("Must supply a data.frame or matrix")
  }
  # compare |r| against sqrt(cutoff), since cutoff is given on the R^2 scale
  r2cut <- sqrt(cutoff)
  cormat <- cor(d)
  # (row, column) positions of all correlations above the threshold
  bad.idx <- which(abs(cormat) > r2cut, arr.ind = TRUE)
  # keep only the lower triangle, so each pair appears once and the diagonal is dropped
  bad.idx <- matrix(bad.idx[bad.idx[, 1] > bad.idx[, 2]], ncol = 2)
  drop.idx <- ifelse(runif(nrow(bad.idx)) > .5, bad.idx[, 1], bad.idx[, 2])
  if (length(drop.idx) == 0) {
    1:ncol(d)
  } else {
    (1:ncol(d))[-unique(drop.idx)]
  }
}
############################################

Now the problem is that the code returns different output (i.e. different column numbers) from one call to the next. I cannot work out from the code why this happens. I understand the logic of the code except for this line:

********************************************
drop.idx <- ifelse(runif(nrow(bad.idx)) > .5, bad.idx[, 1], bad.idx[, 2])
****************************************

I do not see what comparing runif(nrow(bad.idx)) with 0.5 is meant to do. So I am looking for help with two things: why the output differs between calls to the function, and what the line I mentioned above means.

Thanks!
B.Nataraj
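
For anyone reading along, here is a small sketch of what the marked line appears to do when run on its own. It is only an illustration under stated assumptions: the bad.idx matrix below is made-up data standing in for the one built inside r2test, and pick_drops is a hypothetical helper name that does not appear in the original code. As far as I can tell, runif() draws one uniform random number per row, so the comparison with 0.5 acts as a coin flip that decides which column of each correlated pair is dropped. Calling set.seed() with the same value before each call should make the draws repeat.

```r
# Hypothetical pairs of highly correlated columns (row index > column index),
# standing in for the bad.idx matrix built inside r2test.
bad.idx <- matrix(c(3, 1,
                    4, 2,
                    5, 2), ncol = 2, byrow = TRUE)

# Same expression as the marked line, wrapped in a function.
# pick_drops is a made-up name used only for this illustration.
pick_drops <- function(bad.idx) {
  # runif(n) draws n uniform numbers between 0 and 1; each one exceeds 0.5
  # about half the time, so every row gets an independent coin flip.
  coin <- runif(nrow(bad.idx)) > 0.5
  # Where the flip is TRUE take the first member of the pair, else the second.
  ifelse(coin, bad.idx[, 1], bad.idx[, 2])
}

# Resetting the seed to the same value before each call repeats the draws,
# so the two results are the same.
set.seed(42)
first <- pick_drops(bad.idx)
set.seed(42)
second <- pick_drops(bad.idx)
print(identical(first, second))
```

Without the set.seed() calls, two runs are free to pick different members of each pair, which would be consistent with the changing column numbers described above.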