Hi all,

I've been struggling to learn R and need to turn to the list again.

I've got a dataset (comma-delimited file) with the following fields: recid, latitude, longitude, population, dwelling and age. For each observation, I'd like to calculate the total number of people and dwellings and the average age within 2 km. Distance could be Euclidean; however, a proper distance calculation (great-circle route) would be best.

Any assistance would be appreciated.

Thanks,

Danny

--------------
Sample Data
--------------
recid,lat,long,pop,dwell,age
10010265,47.5971174,-52.7039227,584,219,38
10010260,47.5846616,-52.7039147,488,188,34
10010263,47.5936538,-52.7037037,605,232,43
10010287,47.5739426,-52.7035365,548,256,29
10010290,47.5703333,-52.703182,559,336,36
10010284,47.5800199,-52.7013245,394,261,61
10010191,47.5322617,-52.7010442,892,323,23
10010291,47.57004,-52.7009,0,0,0
10010289,47.57141,-52.70023,0,0,0
10010285,47.5832183,-52.6995828,469,239,44
10010273,47.6006838,-52.6984875,855,283,28
10010190,47.472353,-52.697991,0,0,0
10010274,47.6018197,-52.6978362,344,117,51
10010288,47.5755249,-52.6978207,33,0,19
10010275,47.6005037,-52.6968299,232,93,43
10010279,47.5915368,-52.6954916,983,437,33
10010276,47.5993086,-52.6954808,329,131,28
10010278,47.5958782,-52.6934253,251,107,27
10010354,47.6165839,-52.6934037,27,14,47
10010277,47.5975113,-52.6914148,515,194,37
10010293,47.5778754,-52.6910827,58,0,40
10010292,47.5722183,-52.6899332,1112,523,28
10010353,47.6356972,-52.6896838,1387,471,32
10010283,47.5873992,-52.6884621,531,296,41
10010281,47.5983891,-52.6880528,307,113,52
10010280,47.5958439,-52.6878177,374,129,18
10010282,47.5999645,-52.6874407,637,226,22
10010286,47.5797909,-52.6872042,446,280,32
10010355,47.6210282,-52.6777189,197,72,39
On Tue, 24 Feb 2004 dsheuman at rogers.com wrote:

> Hi all,
>
> I've been struggling learning R and need to turn to the list again.
>
> I've got a dataset (comma-delimited file) with the following fields:
> recid, latitude, longitude, population, dwelling and age. For each
> observation, I'd like to calculate the total number of people and
> dwellings and average age within 2 k.m. Distance could be Euclidean,
> however, a proper distance calculation (great circle route) is best.

One possibility using the spdep package is:

> names(sds)
[1] "recid" "lat"   "long"  "pop"   "dwell" "age"
> library(spdep)
> sds.nb2 <- dnearneigh(as.matrix(sds[,2:3]), 0, 2, lonlat=TRUE)
> unlist(lapply(sds.nb2, function(x) ifelse(any(x==0), 0,
+ sum(sds$pop[x]))))+sds$pop
 [1]  9123 10017  9123  8821  6279 10017   892  7061  8245 10654  9010     0
[13]  9010 10017  9010 10681 10122 10122  7627  9574 10654  8034  1611 10095
[25]  9771  9574  8856  9465  2555

using dnearneigh() with the lonlat argument to build up a list of neighbouring points. There are also functions for lonlat distances in the fields package.

Roger

> Any assistance would be appreciated.
> Thanks,
>
> Danny
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

-- 
Roger Bivand
Economic Geography Section, Department of Economics,
Norwegian School of Economics and Business Administration,
Breiviksveien 40, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 93 93
e-mail: Roger.Bivand at nhh.no
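
The same pattern should extend to the other columns asked about. The following is only a sketch in base R: `sum_within()` is a hypothetical helper, and `nb` is a hand-made stand-in for the neighbour list, assuming (as the `any(x==0)` test in the reply suggests) that each element holds neighbour row indices, or a single 0 when there are none. The population-weighted mean age is one possible reading of "average age", not something the thread specifies.

```r
# Sketch only: `nb` stands in for a neighbour list such as sds.nb2 above.
# Each element holds the row indices of one point's neighbours, or a
# single 0 when the point has no neighbours (assumed layout).
sum_within <- function(nb, v) {
  # Sum v over each point's neighbours, then add the point's own value.
  nbr <- vapply(nb, function(x) if (any(x == 0)) 0 else sum(v[x]), numeric(1))
  nbr + v
}

# Hypothetical three-point example: points 1 and 2 are neighbours,
# point 3 is isolated.
nb    <- list(2L, 1L, 0L)
pop   <- c(584, 488, 892)
dwell <- c(219, 188, 323)
age   <- c(38, 34, 23)

pop_tot   <- sum_within(nb, pop)
dwell_tot <- sum_within(nb, dwell)
# Population-weighted mean age; NaN where the total population is 0.
age_mean  <- sum_within(nb, pop * age) / pop_tot
```

With the real data, `nb` would be replaced by `sds.nb2` and the vectors by `sds$pop`, `sds$dwell` and `sds$age`.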
Assume that gcirc(x,y) takes two 2-vectors, the latitude and longitude of x and the latitude and longitude of y, and returns the great-circle distance in kilometers between them.

# read in the data as a matrix
d <- as.matrix( read.csv(myfile) )

# vectorize gcirc, i.e. allow its args to be matrices with one
# coordinate pair per row. The first two lines of this function
# recycle either arg when it has only 1 row and the other has more
gcirc.vec <- function(x,y) {
  if (nrow(x) == 1) x <- matrix(x, nrow(y), ncol(x), byrow=TRUE)
  if (nrow(y) == 1) y <- matrix(y, nrow(x), ncol(y), byrow=TRUE)
  mapply(function(ii,jj) gcirc(x[ii,],y[jj,]), 1:nrow(x), 1:nrow(y))
}

# and perform two applies, one per column to be summed:
f <- function(x, v) sum( d[ gcirc.vec(t(x[2:3]),d[,2:3]) < 2, v ] )
apply(d,1,f,"pop")
apply(d,1,f,"dwell")

---
Date: Tue, 24 Feb 2004 9:58:45 -0500
From: <dsheuman at rogers.com>
To: <R-help at stat.math.ethz.ch>
Subject: [R] Calculate Distance and Aggregate Data?

Hi all,

I've been struggling learning R and need to turn to the list again.

I've got a dataset (comma-delimited file) with the following fields: recid, latitude, longitude, population, dwelling and age. For each observation, I'd like to calculate the total number of people and dwellings and average age within 2 k.m. Distance could be Euclidean, however, a proper distance calculation (great circle route) is best.

Any assistance would be appreciated.
Thanks,

Danny
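
The gcirc() function in the reply above is assumed rather than defined. One way it might look is sketched below, using the haversine formula on a sphere. The default radius of 6371 km is an assumed mean Earth radius, not a value from this thread, and the argument order c(lat, long) follows the description given for gcirc().

```r
# Great-circle distance in km between two points, each given as
# c(lat, long) in decimal degrees, via the haversine formula on a sphere.
gcirc <- function(x, y, radius = 6371) {  # 6371 km: assumed mean Earth radius
  x <- unname(x) * pi / 180               # degrees to radians
  y <- unname(y) * pi / 180
  dlat <- y[1] - x[1]
  dlon <- y[2] - x[2]
  a <- sin(dlat / 2)^2 + cos(x[1]) * cos(y[1]) * sin(dlon / 2)^2
  # min() guards against rounding pushing sqrt(a) slightly above 1
  2 * radius * asin(min(1, sqrt(a)))
}

# First two rows of the sample data: a little under 1.4 km apart
gcirc(c(47.5971174, -52.7039227), c(47.5846616, -52.7039147))
```

Because this version handles one pair of points at a time, it fits the gcirc.vec() wrapper as written; a fully vectorized distance function would avoid the mapply() loop but is not needed for a dataset of this size.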
dsheuman at rogers.com wrote:

> Hi all,
>
> I've been struggling learning R and need to turn to the list again.
>
> I've got a dataset (comma-delimited file) with the following fields:
> recid, latitude, longitude, population, dwelling and age. For each
> observation, I'd like to calculate the total number of people and dwellings
> and average age within 2 k.m. Distance could be Euclidean, however, a
> proper distance calculation (great circle route) is best.
>

A good approximation is the haversine formula, see:

http://www.census.gov/cgi-bin/geo/gisfaq?Q5.1

for the spherical approximation and various corrections to account for the earth's departure from sphericity.

Jim
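
For reference, the spherical form of the haversine formula can be written as below, where the latitudes \varphi_1, \varphi_2 and longitudes \lambda_1, \lambda_2 are in radians and R is the radius of the sphere. A value of R of about 6371 km is a common choice for the Earth but is an assumption here, not something stated in the thread.

```latex
a = \sin^2\!\left(\frac{\varphi_2 - \varphi_1}{2}\right)
    + \cos\varphi_1 \cos\varphi_2 \,
      \sin^2\!\left(\frac{\lambda_2 - \lambda_1}{2}\right),
\qquad
d = 2R \arcsin\!\left(\min\left(1, \sqrt{a}\right)\right)
```

The min(1, ...) term only guards against rounding error for nearly antipodal points; for distances on the order of 2 km it has no effect.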