Frank Hedler
2008-Oct-07 16:27 UTC
[R] vectorization of a loop for mahalanobis distance calculation
Dear all,

We have a data frame x with n people as rows and k variables as columns. For each person (i.e., each row) we want to calculate the distance between that person and EACH other person in x. In other words, we want to create an n x n matrix of distances (with zeros on the diagonal).

However, we do not want Euclidean distances. We want Mahalanobis distances, which take into account the covariance among the variables.

Below is the code we wrote ("covmat" is the variance-covariance matrix of the variables in x, which has to be passed to the mahalanobis function we are using):

    mahadist = function(x, covmat) {
      dismat = matrix(0, ncol = nrow(x), nrow = nrow(x))
      for (i in 1:nrow(x)) {
        # mahalanobis() returns squared distances, hence the ^.5
        dismat[i,] = mahalanobis(as.matrix(x), as.matrix(x[i,]), covmat)^.5
      }
      return(dismat)
    }

This code works, but it is very slow. We were wondering whether it is possible to vectorize this function.

Any help would be greatly appreciated.

Thanks,
Frank
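One possible way to remove the loop, offered as a sketch rather than a definitive answer: if covmat = R'R is the Cholesky factorization, then the Mahalanobis distance between two rows of x equals the ordinary Euclidean distance between the corresponding rows of x %*% solve(R), so dist() can do the pairwise work. The function name mahadist_vec is hypothetical, and the sketch assumes covmat is symmetric positive definite (otherwise chol() fails) and that all columns of x are numeric.

```r
# Sketch only: mahadist_vec is a hypothetical name, not part of any package.
# Assumes covmat is symmetric positive definite and x is all numeric.
mahadist_vec <- function(x, covmat) {
  x <- as.matrix(x)
  # Upper-triangular R with covmat = t(R) %*% R
  R <- chol(covmat)
  # Whiten the data: z = x %*% solve(R), computed via a triangular solve
  z <- x %*% backsolve(R, diag(ncol(x)))
  # Euclidean distances between rows of z are Mahalanobis distances in x
  as.matrix(dist(z))
}

# Small check against the loop-based definition on simulated data
set.seed(1)
x_demo <- as.data.frame(matrix(rnorm(200 * 4), ncol = 4))
covmat_demo <- cov(x_demo)
d_vec <- mahadist_vec(x_demo, covmat_demo)
d_row1 <- sqrt(mahalanobis(as.matrix(x_demo),
                           unlist(x_demo[1, ]), covmat_demo))
all.equal(unname(d_vec[1, ]), unname(d_row1))
```

The result is still an n x n matrix, so memory grows with n^2 either way; if only the lower triangle is needed, keeping the "dist" object returned by dist(z) instead of calling as.matrix() roughly halves the storage.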