Jason Liao
2008-Jan-31 02:28 UTC
[R] fastest way to compute the squared Euclidean distance between two vectors in R
I have a program which needs to compute squared Euclidean distance
between two vectors million of times, which the Rprof shows is the
bottleneck. I wondered if there is any faster way than my own simple
function
distance2 = function(x1, x2)
{
temp = x1-x2
sum(temp*temp)
}
I have searched the R-help archives and can not find anything except
when the arguments are matrices. Thanks for any lead.
Jason
Jason Liao, http://www.geocities.com/jg_liao
Associate Professor of Biostatistics
Drexel University School of Public Health
1505 Race Street, Mail Stop 1033
Bellet Building, 6th Floor
Philadelphia, PA 19102
phone 215-762-3934
____________________________________________________________________________________
Looking for last minute shopping deals?
Dimitris Rizopoulos
2008-Jan-31 09:19 UTC
[R] fastest way to compute the squared Euclidean distance betweentwo vectors in R
hits=0.5 tests=BAYES_00,ETHZ_GEOCITIES,FORGED_RCVD_HELO
X-USF-Spam-Flag: NO
try this:
distance2 <- function (x1, x2) {
temp <- x1 - x2
sum(temp * temp)
}
x1 <- rnorm(1e06)
x2 <- rnorm(1e06)
system.time(for (i in 1:100) distance2(x1, x2))
system.time(for (i in 1:100) crossprod(x1 - x2))
I hope it helps.
Best,
Dimitris
----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven
Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
http://www.student.kuleuven.be/~m0390867/dimitris.htm
----- Original Message -----
From: "Jason Liao" <jg_liao at yahoo.com>
To: <r-help at r-project.org>
Sent: Thursday, January 31, 2008 3:28 AM
Subject: [R] fastest way to compute the squared Euclidean distance
betweentwo vectors in R
>I have a program which needs to compute squared Euclidean distance
> between two vectors million of times, which the Rprof shows is the
> bottleneck. I wondered if there is any faster way than my own simple
> function
>
> distance2 = function(x1, x2)
> {
> temp = x1-x2
> sum(temp*temp)
> }
>
> I have searched the R-help archives and can not find anything except
> when the arguments are matrices. Thanks for any lead.
>
> Jason
>
> Jason Liao, http://www.geocities.com/jg_liao
> Associate Professor of Biostatistics
> Drexel University School of Public Health
> 1505 Race Street, Mail Stop 1033
> Bellet Building, 6th Floor
> Philadelphia, PA 19102
> phone 215-762-3934
>
>
>
>
____________________________________________________________________________________
> Looking for last minute shopping deals?
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
Hesen Peng
2008-Feb-01 06:51 UTC
[R] fastest way to compute the squared Euclidean distance between two vectors in R
Hi,
I wonder whether the following may help a little:
Since $\sum (x_i - y_i)^2 = \sum x_i^2 + \sum y_i^2 - 2 \sum x_i y_i$,
we can modify the function as:
temp.distance<-function(x,y){
sum(x^2) + sum(y^2) - 2* x %*% y
}
I think this may be helpful when you need to compute the bootstrap
distance for vectors from a matrix $X$. We simply need to
pre-calculate sum(x^2) for each vector $x$, and access the resulted
from designated vectors, say $xx$. Then we have:
xx<-RowSum(X^2)
another.distance<-function(i,j){ # i and j are the row index of vector x and
y
xx[i] + xx[j] - 2*x[i,] %*% x[j,]
}
On Jan 31, 2008 10:28 AM, Jason Liao <jg_liao at yahoo.com>
wrote:> I have a program which needs to compute squared Euclidean distance
> between two vectors million of times, which the Rprof shows is the
> bottleneck. I wondered if there is any faster way than my own simple
> function
>
> distance2 = function(x1, x2)
> {
> temp = x1-x2
> sum(temp*temp)
> }
>
> I have searched the R-help archives and can not find anything except
> when the arguments are matrices. Thanks for any lead.
>
> Jason
>
> Jason Liao, http://www.geocities.com/jg_liao
> Associate Professor of Biostatistics
> Drexel University School of Public Health
> 1505 Race Street, Mail Stop 1033
> Bellet Building, 6th Floor
> Philadelphia, PA 19102
> phone 215-762-3934
>
>
>
____________________________________________________________________________________
> Looking for last minute shopping deals?
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
--
???
Hesen Peng
Department of Statistics
Fudan University
Shanghai, P. R. C.
François Pinard
2008-Feb-01 12:45 UTC
[R] fastest way to compute the squared Euclidean distance between two vectors in R
Jason Liao <jg_liao at yahoo.com> writes:> I have a program which needs to compute squared Euclidean distance > between two vectors million of times, which the Rprof shows is the > bottleneck. I wondered if there is any faster way than my own simple > function > > distance2 = function(x1, x2) > { > temp = x1-x2 > sum(temp*temp) > }You might try: distance2 <- function(x1, x2) crossprod(x1-x2) And if you do not have to pass the distance2 function itself to some other function, you might also spare the indirection trhough the distance2 function and call crossprod directly (replacing the comma by a minus sign :-). -- Fran?ois Pinard http://pinard.progiciels-bpi.ca
Seemingly Similar Threads
- how much performance penalty does this incur, scalar as a vector of one element?
- extremely slow recursion in R?
- Linking R with Fortran 90: make: m2c: Command not found
- help needed: taking a function out of a package and it can not find some funtions
- help needed: taking a function out of a package and it can not find some funtions