Andrew McFadden
2008-Mar-12 20:47 UTC
[R] Distances between two datasets of x and y co-ordinates
Hi all,

I am trying to determine the distances between two datasets of x and y points. The number of points in dataset one is very small, perhaps 5-10. The number of points in dataset two is likely to be very large, around 20,000-30,000. My initial approach was to append the first dataset to the second and then carry out the calculation:

    dists <- as.matrix(dist(rbind(data1, data2)))

However, the computer does not have enough memory for this. Most of the calculations carried out this way are unnecessary, as I only want approximately 5 * 20,000 distances rather than 20,000 * 20,000.

    x <- c(2660156,2663703,2658165,2659303,2661531,2660914)
    y <- c(6476767,6475013,6475487,6479659,6477004,6476388)
    data2 <- cbind(x,y)

    x <- c(266500,2611111)
    y <- c(6478767,6485013)
    data1 <- cbind(x,y)

Any suggestions on how to do this would be appreciated.

Regards

Andrew
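To put rough numbers on the memory problem described above, here is a back-of-envelope sketch. It assumes each distance is stored as an 8-byte double and takes 20,000 and 5 as the point counts from the post; the variable names are illustrative only.

```r
# Back-of-envelope memory estimate, assuming 8 bytes per stored distance.
n_large <- 20000   # points in the large dataset (figure from the post)
n_small <- 5       # points in the small dataset (figure from the post)
n_all   <- n_large + n_small

bytes_dist   <- 8 * n_all * (n_all - 1) / 2   # lower triangle held by dist()
bytes_full   <- 8 * n_all^2                   # square matrix from as.matrix()
bytes_needed <- 8 * n_small * n_large         # only the cross distances

# Sizes in MiB
round(c(dist = bytes_dist, full = bytes_full, needed = bytes_needed) / 2^20, 1)
```

Under these assumptions the appended-dataset approach needs gigabytes, while the 5 * 20,000 distances actually wanted fit in under a megabyte.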
Sundar Dorai-Raj
2008-Mar-12 21:24 UTC
[R] Distances between two datasets of x and y co-ordinates
Andrew McFadden said the following on 3/12/2008 1:47 PM:
> I am trying to determine the distances between two datasets of x and y
> points. The number of points in dataset One is very small i.e. perhaps
> 5-10. The number of points in dataset Two is likely to be very large
> i.e. 20,000-30,000.
> [...]

If you're trying to find only the point in data2 closest to each point in data1, then use knn (or knn1) in the 'class' package:

    library(class)
    nn <- knn1(data2, data1, 1:nrow(data2))

which gives you, for each row of data1, the row of data2 closest to it (returned as a factor). Then compute the distance:

    rowSums((data2[nn, ] - data1)^2)^0.5

HTH,

--sundar
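As a cross-check on the nearest-point idea above that needs no extra package, here is a minimal base-R sketch using the data from the original post. It loops over the small dataset and scans the large one; the object name `nearest` is illustrative, and this is a brute-force stand-in, not what knn1 does internally.

```r
# Data from the original post
data2 <- cbind(x = c(2660156, 2663703, 2658165, 2659303, 2661531, 2660914),
               y = c(6476767, 6475013, 6475487, 6479659, 6477004, 6476388))
data1 <- cbind(x = c(266500, 2611111),
               y = c(6478767, 6485013))

# For each row of data1, find the nearest row of data2 and the distance to it.
nearest <- t(apply(data1, 1, function(p) {
  d <- sqrt((data2[, "x"] - p["x"])^2 + (data2[, "y"] - p["y"])^2)
  c(row = which.min(d), dist = min(d))
}))

# One line per data1 point: index of the nearest data2 row, and the distance
nearest
```

Only nrow(data2) distances are held at any one time, so memory stays small even with 20,000-30,000 points in the large dataset.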
Bill.Venables at csiro.au
2008-Mar-13 00:27 UTC
[R] Distances between two datasets of x and y co-ordinates
Here's what I would try. Suppose x1, y1 and x2, y2 are the two data sets.

    z1 <- complex(real = x1, imaginary = y1)
    z2 <- complex(real = x2, imaginary = y2)
    dMat <- outer(z1, z2, function(z1, z2) Mod(z1-z2))

Bill Venables
CSIRO Laboratories
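The complex-number approach can be tried directly on the data from the original post. In this sketch, x1, y1 are taken to be the coordinates of data1 and x2, y2 those of data2 (an assumption about the intended mapping), and dCheck is an illustrative name for a comparison computed from the coordinates themselves.

```r
# Coordinates from the original post: (x1, y1) = data1, (x2, y2) = data2
x1 <- c(266500, 2611111)
y1 <- c(6478767, 6485013)
x2 <- c(2660156, 2663703, 2658165, 2659303, 2661531, 2660914)
y2 <- c(6476767, 6475013, 6475487, 6479659, 6477004, 6476388)

# Treat each point as a complex number; Mod() of a difference is the
# Euclidean distance between the two points.
z1 <- complex(real = x1, imaginary = y1)
z2 <- complex(real = x2, imaginary = y2)
dMat <- outer(z1, z2, function(a, b) Mod(a - b))

dim(dMat)   # 2 rows (data1 points) by 6 columns (data2 points)

# The same matrix computed from the coordinates directly, as a check
dCheck <- sqrt(outer(x1, x2, "-")^2 + outer(y1, y2, "-")^2)
all.equal(dMat, dCheck)
```

Only the cross distances are computed, so the result has length(x1) * length(x2) entries instead of the full square matrix.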
adrian at maths.uwa.edu.au
2008-Mar-14 03:52 UTC
[R] Distances between two datasets of x and y co-ordinates
Andrew McFadden <Andrew.McFadden at maf.govt.nz> writes:
> I am trying to determine the distances between two datasets of x and y
> points.

This can be done efficiently in the package 'spatstat'.

    library(spatstat)
    crossdist(x1, y1, x2, y2)

where x1, y1 are vectors of coordinates for the first set of points and x2, y2 for the second set. See help(crossdist.default).

This is executed in C and is faster than using outer() or apply(). The result is a matrix giving the distance between each pair of points (one point from the first dataset and the other from the second): entry [i, j] is the distance from point i of the first set to point j of the second. If these datasets are large, you can of course run into trouble with the size of this matrix.

If you just wanted to find the distance to the *nearest* point (or identify which point is nearest), use the function nncross().

Adrian Baddeley
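If the full cross-distance matrix were ever too large to hold, one workaround is to process one dataset in blocks and keep only the nearest distance for each point. The following is a minimal base-R sketch of that idea; the function name nearest_in_chunks and its chunk_size argument are hypothetical, the coordinates are made up for illustration, and this is not how crossdist() or nncross() are implemented.

```r
# Nearest distance from each row of A to any row of B, computed block by
# block so that at most chunk_size * nrow(B) distances are held at once.
# nearest_in_chunks and chunk_size are illustrative names, not spatstat API.
nearest_in_chunks <- function(A, B, chunk_size = 1000) {
  if (nrow(A) == 0) return(numeric(0))
  out <- numeric(nrow(A))
  for (s in seq(1, nrow(A), by = chunk_size)) {
    idx <- s:min(s + chunk_size - 1, nrow(A))
    dx <- outer(A[idx, 1], B[, 1], "-")   # x differences for this block
    dy <- outer(A[idx, 2], B[, 2], "-")   # y differences for this block
    out[idx] <- apply(sqrt(dx^2 + dy^2), 1, min)
  }
  out
}

set.seed(1)
A <- cbind(runif(2500), runif(2500))   # made-up coordinates
B <- cbind(runif(300),  runif(300))
nn_dist <- nearest_in_chunks(A, B, chunk_size = 1000)
```

For the sizes in the original question (5-10 points against 20,000-30,000) the full matrix is small and no chunking is needed; the sketch only matters when both datasets are large.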
Hi,

How do I use nncross to measure minimum distances between point patterns i and j, ENSURING that each point in pattern i is connected to only a single partner point in pattern j?

I do realise that there are many possible pairings between the i-j points... perhaps some average minimum distance over many pairings would be better. Alternatively, pairings could be randomly assigned.

Basically, I am tracking a population of moving objects over time and want to indirectly estimate the collective movement, which might be better estimated by telling nncross to (somehow) randomly assign paired nearest-neighbour distances: 1-to-1 rather than many-to-one.

Thanks!

T