Dear R Gurus,

As you probably know, dist calculates the distance between every pair of rows of data. What I am interested in is the actual two rows that have the least distance between them, rather than the numerical value of the distance itself.

For example, the minimum distance in the following sample run is d[14], which is 0.3826119, and the rows are 3 & 6. I need to find a generic way to retrieve these rows, for a generic matrix of NRows (in this example NRows=7).

NCols=5
NRows=7
myMat<-matrix(runif(NCols*NRows), ncol=NCols)

d<-dist(myMat)

          1         2         3         4         5         6
2 0.7202138
3 0.7866527 0.9052319
4 0.6105235 1.0754259 0.8897555
5 0.5032729 1.0789359 0.9756421 0.4167131
6 0.6007685 0.6949224 0.3826119 0.7590029 0.7994574
7 0.9751200 1.2218754 1.0547197 0.5681905 0.7795579 0.8291303

e<-sort.list(d)
e<-e[1:5] ##Retrieve minimum 5 distances

[1] 14 16  4 18  5

--
View this message in context: http://r.789695.n4.nabble.com/Retrieving-the-2-row-of-dist-computations-tp2249844p2249844.html
Sent from the R help mailing list archive at Nabble.com.
Hi there,

I am sure there is a better way to do it, but here is a suggestion:

# res[i, ] holds the (row, column) pair of the i-th smallest distance
res <- matrix(NA, ncol = 2, nrow = 5)
for(i in 1:5) res[i, ] <- which(as.matrix(d) == sort(d)[i], arr.ind = TRUE)[1,]
res

HTH,
Jorge
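One possible vectorised variant of the loop above, offered as a sketch rather than a definitive answer: convert d to a matrix once, mask the diagonal and upper triangle with Inf so that each pair appears only once, and let order() plus arrayInd() return the row/column positions of the k smallest entries. The names k, dm, ord and closest are illustrative and not from the thread; set.seed() is added only so the run is repeatable.

```r
set.seed(42)                      # assumption: fixed seed for a repeatable run
NCols <- 5
NRows <- 7
myMat <- matrix(runif(NCols * NRows), ncol = NCols)
d <- dist(myMat)

k <- 5                            # how many closest pairs to report (illustrative)
dm <- as.matrix(d)                # full NRows x NRows matrix, built once
dm[upper.tri(dm, diag = TRUE)] <- Inf   # keep each pair once (row > col)

ord <- order(dm)[seq_len(k)]      # linear positions of the k smallest distances
closest <- arrayInd(ord, dim(dm)) # convert linear positions to (row, col)
colnames(closest) <- c("row", "col")
cbind(closest, distance = dm[ord])
```

Because the matrix is built once instead of once per iteration, this avoids repeating the as.matrix() conversion; it still needs the full square matrix in memory.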
Bill.Venables at csiro.au
2010-Jun-10 04:57 UTC
[R] Retrieving the 2 row of "dist" computations
This is a lazy way, and a slightly extravagant way if your memory is limited and you are dealing with large numbers of rows.

NCols <- 5
NRows <- 7
myMat <- matrix(runif(NCols*NRows), ncol=NCols)

d <- dist(myMat)
dm <- as.matrix(d)   # full square matrix of distances
diag(dm) <- Inf      # exclude the zero self-distances
ij <- which(dm == min(dm), arr.ind = TRUE)[1,]   # first (row, col) at the minimum
ij
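To put the memory remark in rough numbers, here is a back-of-the-envelope sketch. It assumes 8 bytes per double and ignores object overhead and dimnames; n = 829 is the row count of the full data mentioned elsewhere in the thread, and the variable names are illustrative.

```r
n <- 829                                  # row count of the full data set
bytes_per_double <- 8                     # assumption: standard R numeric storage

dist_bytes   <- choose(n, 2) * bytes_per_double   # lower triangle held by dist()
matrix_bytes <- n^2 * bytes_per_double            # full square from as.matrix()

round(c(dist = dist_bytes, matrix = matrix_bytes) / 2^20, 2)   # sizes in MiB
```

The comparison dm == min(dm) also allocates a logical matrix of the same dimensions, so peak usage is somewhat higher than the square matrix alone.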
Hey,

The code definitely works, but I may need a more elegant way to do it. Rather than 7 rows, the full data contains 829 rows, so instead of d of length 21, d will be of length 343206.

Jorge Ivan Velez wrote:
> res <- matrix(NA, ncol = 2, nrow = 5)
> for(i in 1:5) res[i, ] <- which(as.matrix(d) == sort(d)[i], arr.ind = TRUE)[1,]
> res

--
View this message in context: http://r.789695.n4.nabble.com/Retrieving-the-2-row-of-dist-computations-tp2249844p2249900.html
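For the 829-row case, one way to avoid the square matrix altogether is to rebuild the row and column labels in the same order that a dist object stores its values (lower triangle, column by column) and index them with order(). This is only a sketch: dist_pairs() and its argument k are hypothetical names, not part of base R, and it assumes d comes from at least two rows.

```r
# Hypothetical helper (not part of base R): the k closest pairs in a dist object.
dist_pairs <- function(d, k = 1) {
  n <- attr(d, "Size")                       # number of rows that went into dist()
  dv <- as.vector(d)                         # plain numeric vector of distances
  # dist stores the lower triangle column by column:
  # (2,1), (3,1), ..., (n,1), (3,2), ..., (n,n-1)
  col_idx <- rep(seq_len(n - 1), times = (n - 1):1)
  row_idx <- unlist(lapply(seq_len(n - 1), function(j) (j + 1):n))
  ord <- head(order(dv), k)                  # positions of the k smallest distances
  cbind(row = row_idx[ord], col = col_idx[ord], distance = dv[ord])
}

set.seed(42)                                 # assumption: seed added for repeatability
myMat <- matrix(runif(829 * 5), ncol = 5)    # 829 rows, as in the full data
d <- dist(myMat)
length(d)                                    # choose(829, 2) = 343206
closest <- dist_pairs(d, k = 5)
closest
```

Under this layout, position 14 of a 7-row dist object maps to row 6 and column 3. The two index vectors have the same length as d, so the extra memory is about the size of d itself rather than a full square matrix.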