Dear R-experts,

I have two lists of US zip codes and want to find, for each zip code in the first list, the nearest zip code in the second list. For example, 30043 (from the second list) is closest to 30094 (from the first list), so it should be reported against 30094. The code should compute the distance from each zip code in the first list to every zip code in the second list and pick the nearest one.

I have written the following code. It gives the proper minimum distance for some of the zip codes, but for others mindist shows NA. Please note that it is intended for 5-digit zip codes only. Any help will be highly appreciated.

df1<-read.csv("C:/Users/dxsur/Desktop/ZIP1.csv")
df2<-read.csv("C:/Users/dxsur/Desktop/ZIP2.csv")

results<-merge(x=df1,y=zipcode,all.x=TRUE)
results1<-merge(x=df2,y=zipcode,all.x=TRUE)

distance<-distm(subset(results,select=c(longitude,latitude)),subset(results1,select=c(longitude,latitude)))

rnum=apply(distance, 1, which.min)
mindist=apply(distance, 1, min)

final<-cbind(results,results1$zip[unlist(rnum)],mindist)

Thanks & Regards,
Debasmita
Verify that you actually have five-digit zip codes stored as characters. New Jersey and Massachusetts zip codes have zero as the first digit. When these codes are saved as numbers, they become four-digit codes and will probably cause errors. For example, Cambridge, Mass. is '02138', which would be reported as 2138 when interpreted as a number.

On Tue, Aug 4, 2020 at 9:29 PM Debasmita Sur <ds071185 at gmail.com> wrote:
> Please note it will be effective for 5
> digit zip codes. Any help will be highly appreciated.
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
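
The leading-zero point above can be checked with a few lines of base R. This is only a minimal sketch: the inline CSV text is a hypothetical stand-in for ZIP1.csv, and the column name `zip` is an assumption about the file layout.

```r
# Hypothetical CSV contents standing in for ZIP1.csv (column name assumed).
csv_text <- "zip\n02138\n30094\n07030"

# Default read.csv() guesses an integer column and drops leading zeros.
as_number <- read.csv(text = csv_text)

# Forcing the column to character keeps all five digits.
as_text <- read.csv(text = csv_text, colClasses = c(zip = "character"))

# Codes that were already read as numbers can be repaired by
# left-padding with zeros to a width of five.
repaired <- sprintf("%05d", as.integer(as_number$zip))

print(as_number$zip)
print(as_text$zip)
print(repaired)
```

If the lookup table stores zip codes as character strings, a numeric column on the other side of `merge()` may fail to match the zero-prefixed codes, so reading with `colClasses` up front is probably the simpler fix.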
In addition to Rich's advice... as always, have you searched?! For example, try "zip code distances" or similar at rseek.org. This appears to have been asked before, and there are tools available.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)
Hi Richard,

I have not included any 4-digit zip codes; I have used only 5-digit ones. I have attached two folders. In the 'air' folder I have some specific zip codes and the output gives proper results, whereas for the 'par' folder I get NAs in the minimum distance column. The actual problem is to find the nearest store for a specific brand.

Thanks,
Debasmita

On Wed, Aug 5, 2020 at 7:08 AM Richard M. Heiberger <rmh at temple.edu> wrote:
> verify that you actually have five-digit zip codes stored as
> characters. New Jersey and Massachusetts have zero as
> the first digit. When these codes are saved as numbers, they become
> four-digit codes and will probably cause errors.
> For example Cambridge, Mass is '02138', and would be reported as 2138
> when interpreted as a number..
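
One possible source of the NAs, offered only as a guess since the attached files are not visible here: `merge(..., all.x=TRUE)` keeps zip codes that have no match in the lookup table, those rows get NA coordinates, and `min()` over a row containing NA returns NA. The base R sketch below shows how that could be checked. The lookup table, its rough placeholder coordinates, the zip "99999", and the `haversine()` helper are all hypothetical; the helper is swapped in for `distm()` (presumably from the geosphere package) so the example runs without extra packages.

```r
# Illustrative lookup table; coordinates are rough placeholders, not real data.
lookup <- data.frame(
  zip       = c("30043", "30094", "31401"),
  latitude  = c(34.00, 33.61, 32.07),
  longitude = c(-84.01, -84.05, -81.09),
  stringsAsFactors = FALSE
)
# "99999" is a made-up code with no match in the lookup table.
df1 <- data.frame(zip = c("30094", "99999"), stringsAsFactors = FALSE)
df2 <- data.frame(zip = c("30043", "31401"), stringsAsFactors = FALSE)

results  <- merge(x = df1, y = lookup, all.x = TRUE)
results1 <- merge(x = df2, y = lookup, all.x = TRUE)

# Unmatched rows carry NA coordinates; list them before computing distances.
unmatched <- results$zip[is.na(results$latitude)]

# Haversine great-circle distance in metres (hypothetical stand-in for distm()).
haversine <- function(lon1, lat1, lon2, lat2, r = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Keep only rows with known coordinates on both sides.
ok1 <- results[!is.na(results$latitude), ]
ok2 <- results1[!is.na(results1$latitude), ]

# Rows index the first list, columns index the second list.
distance <- outer(
  seq_len(nrow(ok1)), seq_len(nrow(ok2)),
  function(i, j) haversine(ok1$longitude[i], ok1$latitude[i],
                           ok2$longitude[j], ok2$latitude[j])
)

final <- data.frame(
  zip     = ok1$zip,
  nearest = ok2$zip[apply(distance, 1, which.min)],
  mindist = apply(distance, 1, min),
  stringsAsFactors = FALSE
)
print(unmatched)
print(final)
```

Printing `unmatched` for both lists should show whether the 'par' data contains codes missing from the lookup table. Note also that `which.min()` returns a zero-length result for a row that is entirely NA, so `unlist(rnum)` can come out shorter than the number of rows and misalign the final `cbind()`.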