Gustavo Carvalho
2010-Sep-23 16:53 UTC
[R] looking for a faster way to compare two columns of a matrix
Please consider this matrix:

x <- structure(c(5, 4, 3, 2, 1, 6, 3, 2, 1, 0, 3, 2, 1, 0, 0, 2, 1,
1, 0, 0, 2, 0, 0, 0, 0), .Dim = c(5L, 5L))

For each pair of columns i and j (i < j), I want to calculate the proportion of entries different from 0 in column j that have lower values than the entries in the same row in column i:

x[, 1:2]
sum((x[,1] > x[,2]) & (x[,2] > 0))/sum(x[,2] > 0)

Thus, for this pair, 3 of the 4 entries different from 0 in the second column are lower than the entries in the same row in the first column.

When both columns of a given pair have the same number of cells different from 0, the value of the metric is 0:

x[, 3:4]
colSums(x[, 3:4] > 0)

The same applies if column j has more valid (> 0) entries than column i.

I've been doing this using this idea:

combinations <- combn(1:ncol(x), 2)
values <- numeric(ncol(combinations))

for (i in 1:ncol(combinations)) {
  pair <- combinations[,i]
  first <- x[, pair[1]]
  second <- x[, pair[2]]
  if (sum(first > 0) <= sum(second > 0)) next
  values[i] <- sum(first - second > 0 & second > 0) / sum(second > 0)
}
values

Anyway, I was wondering if there is a faster/better way. I've tried putting the code from the for loop into a function and passing it to combn but, as expected, it didn't help much. Any pointers to functions that I should be looking into will be greatly appreciated.

Thank you very much,

Gustavo.
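As a reference point for the description above, the per-pair rule can be written as a small function. This is only a sketch: `pair_metric` is a hypothetical name that does not appear in the post, and it assumes all entries are non-negative, as in the example matrix.

```r
x <- structure(c(5, 4, 3, 2, 1, 6, 3, 2, 1, 0, 3, 2, 1, 0, 0, 2, 1,
                 1, 0, 0, 2, 0, 0, 0, 0), .Dim = c(5L, 5L))

# pair_metric is a hypothetical helper name, not part of the original post.
# first  = column i, second = column j of the pair (i < j).
pair_metric <- function(first, second) {
  n_first  <- sum(first > 0)   # valid (> 0) entries in column i
  n_second <- sum(second > 0)  # valid (> 0) entries in column j
  # The metric is defined as 0 unless column i has more valid entries.
  if (n_first <= n_second) return(0)
  # Share of valid entries in column j that are lower than column i.
  sum(first > second & second > 0) / n_second
}

pair_metric(x[, 1], x[, 2])
pair_metric(x[, 3], x[, 4])
```

For columns 1 and 2 this gives 0.75 (3 of 4), and for columns 3 and 4 it gives 0 because both columns have three entries above 0.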
Michael Bedward
2010-Sep-24 08:44 UTC
[R] looking for a faster way to compare two columns of a matrix
Hello Gustavo,

Not sure if I've got all the details of your metric, but what about this...

xx <- x[ , combn(5,2)]
i <- seq(2, ncol(xx), 2)
colSums(xx[,i-1] > xx[,i] & xx[,i] > 0) / colSums(xx[,i] > 0)

Michael
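The vectorised suggestion computes the proportion for every pair but does not apply the rule that the metric is 0 when the first column has no more valid entries than the second. A possible extension, offered as a sketch rather than as either poster's code, indexes the two sides of each pair separately and masks those pairs afterwards. The names `idx`, `first`, `second`, `n_first` and `n_second` are illustrative.

```r
x <- structure(c(5, 4, 3, 2, 1, 6, 3, 2, 1, 0, 3, 2, 1, 0, 0, 2, 1,
                 1, 0, 0, 2, 0, 0, 0, 0), .Dim = c(5L, 5L))

# One column of idx per pair: row 1 holds i, row 2 holds j (i < j).
idx <- combn(ncol(x), 2)

# Stack the "first" and "second" column of every pair side by side.
first  <- x[, idx[1, ], drop = FALSE]
second <- x[, idx[2, ], drop = FALSE]

# Number of valid (> 0) entries on each side of every pair.
n_first  <- colSums(first > 0)
n_second <- colSums(second > 0)

# Proportion of valid entries in column j that are lower than column i.
values <- colSums(first > second & second > 0) / n_second

# Apply the rule from the original loop: 0 unless column i has more
# valid entries than column j.
values[n_first <= n_second] <- 0
values
```

With the example matrix, the mask only changes the pair of columns 3 and 4, which both have three entries above 0. The trade-off is memory: `first` and `second` each hold one column per pair, so this may not suit matrices with a very large number of columns.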