Michael Rennie
2008-Jul-15 13:07 UTC
[R] manipulating (extracting) data from distance matrices
Hi all,

Does anyone have any tips for extracting chunks of data from a distance matrix?

For instance, if one was interested in only a subset of distance comparisons (i.e., those of rows 4 through 6, and no others), is there a simple way to pull that data out?

From some playing around with an example (below), I've been able to figure out that a distance matrix in R is stored as a single vector, running top to bottom and left to right, so if you know the size of your distance matrix, you can figure out which elements to query and stick them together using c().

However, all this stuff is still indexed by the "Labels" attribute. Does anyone know of a way to use that to pull out subsets from the distance matrix in a simpler manner than my example code below?

##############
# ex_dist.R
# example for
# manipulating
# distance matrices
####################

set.seed(12345)

a<-sample(20:40, 10)
b<-sample(80:100, 10)
c<-sample(0:40, 10)

dat<-data.frame(a,b,c)
dat

dmat<-dist(dat, method="euclidean")
dmat

dmat[1:6] #vector that stores the distance matrix runs descending down columns, left to right

#in a 10-element distance matrix, column lengths are 9,8,7,6....1

#get comparisons of rows 1:4 (from dat) ONLY
#top-left matrix will consist of top 3 of first column, top 2 of second col, top 1 of third col.

topleft<-c(dmat[1:3],dmat[10:11],dmat[18])
topleft

#get comparisons of rows 9:10 (from dat)
#bottom 2 of first column, bottom 2 of second col.
#note: these 4 values compare rows 9:10 against rows 1:2; the single 9-vs-10 distance is dmat[45]

bottomright<-c(dmat[8:9],dmat[16:17])
bottomright

#######end#####

I'm sure there's a simpler way to do this using the labels of the distance matrix, but I can't see it. I've thought of converting it using as.matrix(), which would allow me to pull out particular rows, but I'm only interested in the triangular matrix. Now, if there was a way to as.matrix(dmat) such that I got the bottom triangular matrix and zeros elsewhere, then I'd be in business. Any suggestions on how to pull that off would be helpful.
I'm certainly interested in any tips or tricks anyone might have for working with distance matrices, or any material that people can point me towards.

Cheers,

Mike

--
Michael D. Rennie
Ph.D. Candidate
University of Toronto at Mississauga
3359 Mississauga Rd. N.
Mississauga, ON L5L 1C6
Ph: 905-828-5452 Fax: 905-828-3792
www.utm.utoronto.ca/~w3rennie
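The storage layout described in the post can be turned into a small index calculation instead of counting column lengths by hand. The following is a minimal sketch, assuming the column-wise lower-triangle order documented in ?dist. The helper name dist_index is hypothetical (it is not a base R function), and the data are regenerated with the same sample() calls as the example.

```r
# Hypothetical helper (not part of base R): position of the distance between
# observations i and j inside the vector behind a "dist" object of size n.
dist_index <- function(i, j, n) {
  lo <- pmin(i, j)  # column in the printed lower triangle
  hi <- pmax(i, j)  # row in the printed lower triangle
  stopifnot(all(lo >= 1), all(hi <= n), all(lo < hi))
  n * (lo - 1) - lo * (lo - 1) / 2 + hi - lo
}

set.seed(12345)
dat <- data.frame(a = sample(20:40, 10),
                  b = sample(80:100, 10),
                  c = sample(0:40, 10))
dmat <- dist(dat, method = "euclidean")
n <- attr(dmat, "Size")

# All pairs among rows 1:4, one pair per row of 'pairs'
pairs <- t(combn(1:4, 2))
idx <- dist_index(pairs[, 1], pairs[, 2], n)
idx                     # positions 1, 2, 3, 10, 11, 18
topleft <- dmat[idx]    # same elements as c(dmat[1:3], dmat[10:11], dmat[18])
topleft
```

With this mapping, the distance between rows 9 and 10 sits at dist_index(9, 10, n), which is position 45 in a 10-observation distance matrix.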
stephen sefick
2008-Jul-15 13:35 UTC
[R] manipulating (extracting) data from distance matrices
How about this:

f <- as.matrix(dmat)
f[,4:6] #you get repeats but I think this is what you want

--
Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis
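One way to avoid hand-computed positions is to go through as.matrix(), subset by label on both dimensions, and convert back with as.dist(). The following is a sketch rather than a definitive recipe. The object names m, keep, sub and lower are made up for illustration, and the data are regenerated with the same sample() calls as the original example. It also builds the lower-triangular matrix with zeros elsewhere that the original post asks about.

```r
set.seed(12345)
dat <- data.frame(a = sample(20:40, 10),
                  b = sample(80:100, 10),
                  c = sample(0:40, 10))
dmat <- dist(dat, method = "euclidean")

# Full symmetric matrix; its dimnames come from the "Labels" attribute
m <- as.matrix(dmat)

# Distances among rows 4 to 6 only, selected by label,
# then converted back to a compact "dist" object
keep <- c("4", "5", "6")
sub <- as.dist(m[keep, keep])
sub

# Lower triangle kept, zeros everywhere else
lower <- m
lower[upper.tri(lower)] <- 0
lower[keep, keep]
```

Note that f[,4:6] in the reply keeps all ten rows, so it returns every comparison involving rows 4 to 6. Indexing both dimensions with the same labels restricts the result to comparisons among those rows only.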