A Ezhil
2012-Mar-08 23:57 UTC
[R] Correlation between 2 matrices but with subset of variables
Dear All,

I have two matrices, A (40 x 732) and B (40 x 1230), and would like to calculate the correlations between them. I can use cor(A, B, method="pearson") to calculate the correlation between all possible pairs of columns. The issue is that there is a specific one-to-many mapping between A and B, and I only need the correlations for the mapped pairs, not for all of them. Some variables in A (proteins, say p1) have one, two, three, or more corresponding variables in B (mRNAs, say m1, m2, m3). I would like to calculate the correlations for p1-m1, p1-m2, and p1-m3, then do the same for the second variable p2, and so on.

I have the mapping information in another file (an annotation file). Could you please suggest how to do this?

Thanks in advance.

Kind regards,
Ezhil
chuck.01
2012-Mar-09 01:34 UTC
[R] Correlation between 2 matrices but with subset of variables
Example data would be helpful.

______________________________________________
R-help@ mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
R. Michael Weylandt
2012-Mar-09 02:29 UTC
[R] Correlation between 2 matrices but with subset of variables
Well, it would be possible to set something up to select out just the right pairs each time, but on my system the following

  a <- matrix(rnorm(40 * 732), 40)
  b <- matrix(rnorm(40 * 1230), 40)
  system.time(cor(a, b, method = "pearson"))

takes about a tenth of a second, so any more selective approach will probably lose out to the interpreter time needed for all the subsetting. The slowest calculation of this sort (Kendall correlation) takes about 30 seconds.

In short, just do cor(A, B, method = "pearson") and then subset from the overall correlation matrix.

Michael
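A minimal sketch of the "compute everything, then subset" suggestion above. It assumes that both matrices have column names and that the annotation file has been read into a data frame; the column names `protein` and `mrna`, the IDs, and the toy dimensions are all hypothetical, since the thread does not show the real annotation format.

```r
set.seed(1)

# Toy stand-ins for the real 40 x 732 and 40 x 1230 matrices (assumed to have column names)
A <- matrix(rnorm(40 * 4), nrow = 40, dimnames = list(NULL, paste0("p", 1:4)))
B <- matrix(rnorm(40 * 6), nrow = 40, dimnames = list(NULL, paste0("m", 1:6)))

# Hypothetical one-to-many mapping, as it might look after reading the annotation file
map <- data.frame(protein = c("p1", "p1", "p1", "p2", "p4"),
                  mrna    = c("m1", "m2", "m3", "m4", "m6"),
                  stringsAsFactors = FALSE)

# Full correlation matrix: rows are columns of A, columns are columns of B
full <- cor(A, B, method = "pearson")

# A two-column character matrix used as an index picks one (row, column) cell per mapped pair
map$r <- full[cbind(map$protein, map$mrna)]

print(map)
```

Indexing with a two-column matrix avoids an explicit loop over pairs. If some IDs in the annotation are missing from the matrices, they would need to be filtered out first, because character matrix indexing fails on names that are not in the dimnames.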
Petr Savicky
2012-Mar-09 07:48 UTC
[R] Correlation between 2 matrices but with subset of variables
Hi.

Try the following.

1. Create some simple data

  X <- matrix(rnorm(15), nrow=5, ncol=3)
  Y <- matrix(rnorm(25), nrow=5, ncol=5)

2. Choose a table of pairs of columns, for which the correlation
   should be computed, and expand the matrices.

  ind <- rbind(
    c(1, 1),
    c(1, 2),
    c(2, 2),
    c(3, 3),
    c(3, 4),
    c(3, 5))

  X1 <- X[, ind[, 1]]
  Y1 <- Y[, ind[, 2]]

3. Compute the correlations between X1[, i] and Y1[, i] and
   compare to the diagonal of cor(X1, Y1)

  parallel.cor <- function(X, Y)
  {
      X <- sweep(X, 2, colMeans(X))
      Y <- sweep(Y, 2, colMeans(Y))
      colSums(X*Y)/sqrt(colSums(X^2)*colSums(Y^2))
  }

  out <- parallel.cor(X1, Y1)
  verif <- diag(cor(X1, Y1))
  all.equal(out, verif)

  [1] TRUE

Hope this helps.

Petr Savicky.
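The recipe above uses numeric column indices, while the original question mentions a mapping kept in an annotation file. The following sketch shows one way the two might be connected, assuming the matrices carry column names and the annotation has been read into a data frame. The data frame `annot`, its columns `protein` and `mrna`, and all IDs are hypothetical; `parallel.cor` is copied from the message above so that the block runs on its own.

```r
# Column-wise Pearson correlation between matching columns of X and Y (from the message above)
parallel.cor <- function(X, Y) {
  X <- sweep(X, 2, colMeans(X))   # center each column
  Y <- sweep(Y, 2, colMeans(Y))
  colSums(X * Y) / sqrt(colSums(X^2) * colSums(Y^2))
}

set.seed(2)
A <- matrix(rnorm(40 * 3), nrow = 40, dimnames = list(NULL, paste0("p", 1:3)))
B <- matrix(rnorm(40 * 5), nrow = 40, dimnames = list(NULL, paste0("m", 1:5)))

# Hypothetical annotation table; "p9" is deliberately absent from A
annot <- data.frame(protein = c("p1", "p1", "p2", "p3", "p9"),
                    mrna    = c("m1", "m2", "m3", "m5", "m2"),
                    stringsAsFactors = FALSE)

# Translate IDs into column positions; unmatched IDs become NA
i <- match(annot$protein, colnames(A))
j <- match(annot$mrna, colnames(B))
keep <- !is.na(i) & !is.na(j)

# Expand both matrices to one column per mapped pair, then correlate column by column
res <- annot[keep, ]
res$r <- parallel.cor(A[, i[keep], drop = FALSE], B[, j[keep], drop = FALSE])

print(res)
```

Dropping unmatched pairs via `keep` is a design choice made for this sketch; depending on the analysis, it may be preferable to keep those rows with `NA` as the correlation so that gaps in the annotation stay visible.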
A Ezhil
2012-Mar-10 00:54 UTC
[R] Correlation between 2 matrices but with subset of variables
Dear Peter,

Thank you very much.

Best regards,
Ezhil