I wrote a function to calculate cosine distances between rows of a matrix.
It uses two loops and is slow. Any suggestions to speed this up? Thanks in
advance.
theta.dist <- function(x){
  res <- matrix(NA, nrow(x), nrow(x))
  for (i in 1:nrow(x)){
    for(j in 1:nrow(x)){
      if (i > j)
        res[i, j] <- res[j, i]
      else {
        v1 <- x[i,]
        v2 <- x[j,]
        good <- !is.na(v1) & !is.na(v2)
        v1 <- v1[good]
        v2 <- v2[good]
        theta <- acos(v1%*%v2 / sqrt(v1%*%v1 * v2%*%v2 )) / pi * 180
        res[i,j] <- theta
      }
    }
  }
  as.dist(res)
}
I think this will do what you want, though there may be ways of speeding it
up further.
theta.dist <- function(x)
  as.dist(acos(crossprod(t(x))/sqrt(crossprod(t(rowSums(x*x)))))/pi*180)
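
For readers following along, here is a sketch of why the one-liner works. crossprod(t(x)) is x %*% t(x), the matrix of all pairwise row dot products, and the denominator is the outer product of the row norms, so each entry is the cosine of the angle between two rows. The version below spells the same idea with tcrossprod() and outer(). The name cosine_angle_dist, the clamping step, and the test matrix are illustrative additions, not part of the thread, and the sketch assumes x has no NA values (the original loop drops NAs pairwise; this does not).

```r
# Sketch only: cosine_angle_dist is a hypothetical name, not from the thread.
# Assumes x is a numeric matrix with no NA values and no all-zero rows.
cosine_angle_dist <- function(x) {
  dots  <- tcrossprod(x)             # x %*% t(x): all pairwise dot products
  norms <- sqrt(diag(dots))          # Euclidean norm of each row
  cosine <- dots / outer(norms, norms)
  # Clamp to [-1, 1] so rounding error cannot push acos() out of its domain
  cosine <- pmin(pmax(cosine, -1), 1)
  as.dist(acos(cosine) / pi * 180)   # angles in degrees
}

# Compare with the one-liner above on a small complete matrix
set.seed(1)
x <- matrix(rnorm(20), nrow = 5)
simon <- as.dist(acos(crossprod(t(x)) / sqrt(crossprod(t(rowSums(x * x))))) / pi * 180)
all.equal(as.vector(cosine_angle_dist(x)), as.vector(simon))
```

On complete data the two should agree up to floating-point tolerance; the gain over the loop comes from replacing n^2 small vector products with one matrix product.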
***********************************
Simon Gatehouse
CSIRO Exploration and Mining,
Newbigin Close off Julius Ave
North Ryde, NSW
Mail: PO Box 136, North Ryde
NSW 1670, Australia
Phone: 61 (2) 9490 8677
Fax: 61 (2) 9490 8921
Mobile: 61 0407 130 635
E-mail: simon.gatehouse@csiro.au
Web Page: http://www.csiro.au/
-----Original Message-----
From: Xiao-Jun Ma [mailto:xma@arcturusag.com]
Sent: Friday, November 28, 2003 10:02 AM
To: 'r-help@stat.math.ethz.ch '
Subject: [R] Getting rid of loops?
______________________________________________
R-help@stat.math.ethz.ch mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Simon and Peter,

Thanks for your help. Peter's function speeds it up 25x vs. my naive code!

XiaoJun

-----Original Message-----
From: Peter Dalgaard
To: Simon.Gatehouse at csiro.au
Cc: r-help at stat.math.ethz.ch; Xiao-Jun Ma
Sent: 02-12-03 15.57
Subject: Re: [R] Getting rid of loops?

Simon.Gatehouse at csiro.au writes:

> I think this will do what you want, though there may be ways of speeding it
> up further.
>
> theta.dist2 <- function(x)
>   as.dist(acos(crossprod(t(x))/sqrt(crossprod(t(rowSums(x*x)))))/pi*180)

Or,

theta.dist <- function(x)
  as.dist(acos(cov2cor(crossprod(t(x))))/pi*180)

Now, if only there was a way to tell cor() not to center the variables,
we'd have

as.dist(acos(cor(t(x),center=F))/pi*180)

Unfortunately there's no such argument.

--
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
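
One point worth noting about the one-liners in this thread: both the crossprod() version and the cov2cor() version assume complete data, whereas the original loop drops NA entries pairwise. The sketch below is one possible way to keep that pairwise behaviour without loops, by zero-filling the NAs and using an indicator matrix to restrict each squared norm to the jointly observed columns. The name theta_dist_na and the test data are hypothetical, and this is an illustration under those assumptions, not something proposed by the posters.

```r
# Sketch only: theta_dist_na is a hypothetical name, not from the thread.
# Assumes x is a numeric matrix; NAs are dropped pairwise, as in the loop.
theta_dist_na <- function(x) {
  obs <- !is.na(x)                  # TRUE where a value is observed
  x0 <- x
  x0[!obs] <- 0                     # zero-fill so NAs add nothing to sums
  dots <- tcrossprod(x0)            # dot products over jointly observed columns
  # sq[i, j] = sum of x[i, k]^2 over columns k observed in both rows i and j
  sq <- tcrossprod(x0^2, obs * 1)
  cosine <- dots / sqrt(sq * t(sq))
  # Clamp to [-1, 1] so rounding error cannot push acos() out of its domain
  cosine <- pmin(pmax(cosine, -1), 1)
  as.dist(acos(cosine) / pi * 180)  # angles in degrees
}

# On complete data this should agree with the cov2cor() version
set.seed(1)
x <- matrix(rnorm(20), nrow = 5)
peter <- as.dist(acos(cov2cor(crossprod(t(x)))) / pi * 180)
all.equal(as.vector(theta_dist_na(x)), as.vector(peter))

# With a missing value, the one-liners return NA for every pair involving
# row 2, while the pairwise version still gives an angle
x[2, 3] <- NA
theta_dist_na(x)
```

If two rows share no observed columns the result for that pair is NaN, which matches what the loop would produce in the same situation.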