Frank Hedler
2008-Oct-07 16:27 UTC
[R] vectorization of a loop for mahalanobis distance calculation
Dear all,
We have a data frame x with n people as rows and k variables as columns.
Now, for each person (i.e., each row) we want to calculate a distance
between him/her and EACH other person in x. In other words, we want to
create an n x n matrix of distances (with zeros on the diagonal).
However, we do not want to calculate Euclidean distances. We want to calculate
Mahalanobis distances, which take into account the covariance among
variables.
Below is the code we wrote ("covmat" in the function below is the
variance-covariance matrix among the variables in x, which has to be passed to
the mahalanobis function we are using).
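mahadist = function(x, covmat) {
  # n x n matrix of pairwise distances, filled one row at a time
  dismat = matrix(0, ncol = nrow(x), nrow = nrow(x))
  for (i in 1:nrow(x)) {
    # mahalanobis() returns squared distances, hence the square root
    dismat[i, ] = mahalanobis(as.matrix(x), as.matrix(x[i, ]), covmat)^.5
  }
  return(dismat)
}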
This piece of code works, but it is very slow. We were wondering if it's at
all possible to somehow vectorize this function. Any help would be greatly
appreciated.
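One possible direction, offered only as a sketch and not as a tested solution for our data: if covmat is positive definite, the rows of x can be transformed with the Cholesky factor of covmat so that ordinary Euclidean distances between the transformed rows equal the Mahalanobis distances between the original rows. The name mahadist_vec is hypothetical, and the positive-definiteness of covmat is an assumption (chol() fails otherwise).

```r
# Sketch: all pairwise Mahalanobis distances without an explicit loop.
# Assumes covmat is symmetric positive definite.
mahadist_vec <- function(x, covmat) {
  x <- as.matrix(x)
  # chol() returns an upper-triangular R with t(R) %*% R equal to covmat
  R <- chol(as.matrix(covmat))
  # Rows of z are the rows of x multiplied by the inverse of R, so that
  # squared Euclidean distances in z equal (a - b) %*% solve(covmat) %*% t(a - b)
  z <- x %*% solve(R)
  # dist() computes Euclidean distances between rows; as.matrix() expands
  # the result to a full n x n matrix with zeros on the diagonal
  as.matrix(dist(z))
}
```

With covmat = cov(x), mahadist_vec(x, covmat) should agree with mahadist(x, covmat) up to floating-point error. The result carries dimnames added by dist(), which can be dropped with unname() if an exact structural match is needed.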
Thanks,
Frank