Takatsugu Kobayashi
2013-Oct-31 10:14 UTC
[R] Efficient way to convert covariance to Euclidian distance matrix
Hi RUsers,

I am struggling to come up with an efficient, vectorized way to convert a 20K x 20K covariance matrix to a Euclidian distance matrix as a surrogate for a dissimilarity matrix. I hope to then apply multidimensional scaling to map these 20K points (commercial products).

I understand that Distance(ij) = sigma(i) + sigma(j) - 2cov(ij). Without relying on a slow loop, I would appreciate it if anyone could help me out with a better idea. Perhaps lapply?

Thank you very much.

Taka
S Ellison
2013-Oct-31 11:16 UTC
[R] Efficient way to convert covariance to Euclidian distance matrix
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Takatsugu Kobayashi
>
> I am struggling to come up with an efficient vectorized way to convert
> 20Kx20K covariance matrix to a Euclidian distance matrix as a surrogate for
> dissimilarity matrix. Hopefully I can apply multidimensional scaling for
> mapping these 20K points (commercial products).
>
> I understand that Distance(ij) = sigma(i) + sigma(j) - 2cov(ij).

I suspect there's a typo or two in here. sigma(i)^2 + sigma(j)^2 - 2cov(ij) would be the variance of a difference x[i] - x[j]. That's not in the same units as the difference itself, so one might well want the standard deviation of the difference, that is, sqrt(sigma(i)^2 + sigma(j)^2 - 2cov(ij)).

I don't envy your attempt to work with 20k*20k matrices, though. That's about 3 Gbytes per object, and a lot of distances for MDS to optimise. If it's just about visual display, perhaps prcomp on the original data would provide (visually) similar results without the overhead of a large covariance matrix?

S Ellison
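The identity behind the correction above can be checked numerically. The following is only a minimal sketch on simulated data: the vectors `x` and `y` are hypothetical stand-ins for two of the products and do not come from the thread. It also reproduces the rough memory estimate for a dense 20K x 20K matrix of doubles.

```r
# Illustrative check of Var(x - y) = Var(x) + Var(y) - 2 * Cov(x, y).
# x and y are simulated (hypothetical) measurements for two products.
set.seed(1)
x <- rnorm(100)
y <- 0.5 * x + rnorm(100)

var_diff_direct  <- var(x - y)                        # variance of the difference
var_diff_formula <- var(x) + var(y) - 2 * cov(x, y)   # same quantity from (co)variances

# Standard deviation of the difference, back in the original units
sd_diff <- sqrt(var_diff_formula)

# TRUE if the two routes agree up to floating-point tolerance
agree <- isTRUE(all.equal(var_diff_direct, var_diff_formula))

# Approximate size in GiB of one dense 20000 x 20000 matrix of 8-byte doubles
gib <- 20000^2 * 8 / 2^30
```

Because the sample covariance is bilinear, the two variance computations should match to rounding error, which is why the square root of the formula can serve as a distance in the data's own units.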
Rolf Turner
2013-Oct-31 23:01 UTC
[R] Efficient way to convert covariance to Euclidian distance matrix
On 10/31/13 23:14, Takatsugu Kobayashi wrote:
> Hi RUsers,
>
> I am struggling to come up with an efficient vectorized way to convert
> 20Kx20K covariance matrix to a Euclidian distance matrix as a surrogate for
> dissimilarity matrix. Hopefully I can apply multidimensional scaling for
> mapping these 20K points (commercial products).
>
> I understand that Distance(ij) = sigma(i) + sigma(j) - 2cov(ij). Without
> replying on a slow loop, I appreciate if anyone can help me out with a
> better idea - guess lapply?

As S. Ellison has pointed out, you probably want sigma^2 rather than sigma.

My suspicion is that with a 20K x 20K covariance matrix:

    * nothing will work
    * even if it did, the results would be meaningless numerical noise.

I.e. Get real.

That being said, for a *reasonable* size of covariance matrix, the following might do what you want:

    DM <- outer(diag(CM), diag(CM), "+") - 2*CM

where "CM" is the covariance matrix. And then you might want to do

    DM <- sqrt(DM)

to get back to the original units (as S. Ellison indicated).

    cheers,

        Rolf Turner
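For readers who want to try the outer() approach on a small case, here is a hedged sketch. The data matrix `X` is simulated and purely hypothetical (50 observations of 6 "products"), and the clamp with pmax() and the cmdscale() call are additions for illustration, not part of the replies above. The cross-check relies on the fact that sigma(i)^2 + sigma(j)^2 - 2cov(ij) equals the squared Euclidean distance between centered columns i and j, divided by n - 1.

```r
# Simulated data: 50 observations (rows) of 6 hypothetical products (columns).
set.seed(42)
X  <- matrix(rnorm(50 * 6), nrow = 50, ncol = 6)
CM <- cov(X)                       # 6 x 6 covariance matrix

# Vectorized conversion: variance of each pairwise difference, no loop needed.
v  <- diag(CM)                     # variances sigma(i)^2
DM <- outer(v, v, "+") - 2 * CM

# Rounding error can leave tiny negative entries; clamp before the square root.
DM <- sqrt(pmax(DM, 0))

# Cross-check against Euclidean distances between the centered columns,
# rescaled by sqrt(n - 1) to match the covariance normalization.
Xc  <- scale(X, center = TRUE, scale = FALSE)
ref <- as.matrix(dist(t(Xc))) / sqrt(nrow(X) - 1)
same <- isTRUE(all.equal(DM, ref, check.attributes = FALSE))

# Classical multidimensional scaling into two dimensions for plotting.
coords <- cmdscale(as.dist(DM), k = 2)
```

On a matrix this small the whole computation is instantaneous; the memory and numerical-noise concerns raised in the thread apply to the 20K x 20K case, not to the technique itself.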