Hi r-users,

I would like to know if R has any solution for "address standardization". The problem is to match a database of addresses against the real street addresses of Spain. Ideally, I would like to assign postal code, census data and other geographic information.

If this is not possible, I would like to know about solutions in R for text mining, text classification, distance between text strings, ...

Any help will be appreciated.

Thanks in advance,
Ferran Carrascosa
> From: Ferran Carrascosa
>
> Hi r-users,
>
> I would like to know if R have any solution to the "Address
> standardization". The problem is to classify a database of
> addresses with the real addresses of a streets of Spain.
> Ideally, I would like to assign Postal code, census data and
> other geographic information.

I have no idea about this one...

> If this is not possible I would like to know solutions in R
> about text mining, text classification, distance within text data,...

RSiteSearch("text mining") produced hits that look relevant.

Andy
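For the "distance within text data" part of the question, one possible starting point is edit distance between a cleaned-up address and a reference list of street names. The following is only a rough sketch in base R using adist() from the utils package. The street names, the raw addresses, the list of abbreviations in normalize() and the acceptance threshold are all made up for illustration; a real reference list would have to come from an official street directory, and accented characters would need extra care.

```r
# Hypothetical reference list of official street names (illustrative only)
reference <- c("Calle de Alcala", "Avenida Diagonal",
               "Gran Via", "Paseo de la Castellana")

# Hypothetical free-text addresses, two of them with typing errors
raw <- c("c/ alcala 45", "Avda. Diagnal, 211",
         "GRAN VIA 3", "Pso Castelana 100")

# Reduce an address to its distinctive street-name part
normalize <- function(x) {
  x <- tolower(x)
  x <- gsub("[0-9]+", " ", x)                      # drop house numbers
  x <- gsub("[[:punct:]]", " ", x)                 # drop punctuation
  # drop street-type words and particles (assumed, incomplete list)
  x <- gsub("\\b(c|calle|avda|avenida|av|pso|paseo|de|la)\\b", " ",
            x, perl = TRUE)
  x <- gsub("[[:space:]]+", " ", x)                # collapse whitespace
  trimws(x)
}

# Edit-distance matrix: rows are raw addresses, columns are reference streets
d <- adist(normalize(raw), normalize(reference))

# For each raw address, take the reference street with the smallest distance
best <- apply(d, 1, which.min)
matches <- data.frame(
  raw      = raw,
  matched  = reference[best],
  distance = d[cbind(seq_along(raw), best)],
  stringsAsFactors = FALSE
)

# Arbitrary cut-off: anything further away should be checked by hand
matches$accept <- matches$distance <= 2

print(matches)
```

Taking the nearest candidate always returns something, even for an address that matches nothing in the reference list, so some cut-off with manual review of the rejected rows is probably needed in practice.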
Roger Bivand
2006-Mar-15 18:02 UTC
[R] Address matching (was R-help Digest, Vol 37, Issue 12)
On Sun, 12 Mar 2006, Ferran Carrascosa wrote:

> Hi r-users,
>
> I would like to know if R have any solution to the "Address standardization".
> The problem is to classify a database of addresses with the real
> addresses of a streets of Spain. Ideally, I would like to assign
> Postal code, census data and other geographic information.

There are no such built-in databases in R, and commercial solutions are typically costly. I assume you have addresses, and need a spatial index or key to associate the addresses with polygon containers like postcodes or census tracts. If there are not too many and they are all close to each other, a GPS and a bike are very effective ...

Watch out for false positives in commercial automatic address matching; they may be about 10% (mostly typing errors in the input data, but they can also be honest errors in the software).

> If this is not possible I would like to know solutions in R about text
> mining, text classification, distance within text data,...

RSiteSearch("text mining") is quite productive.

--
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no
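To illustrate the idea of associating addresses with polygon containers, here is a minimal base R sketch of a point-in-polygon lookup. It assumes each address already has coordinates (from a GPS or a geocoder), which is the hard part and is not addressed here. The zone names, square boundaries and address coordinates are invented for the example, and point_in_polygon() and assign_zone() are hypothetical helpers, not functions from any package. A dedicated spatial package with real boundary files would normally replace the hand-written test.

```r
# Ray-casting test: is the point (px, py) inside the polygon whose
# vertices are given in order by (vx, vy)? Points exactly on an edge
# are not handled consistently by this simple version.
point_in_polygon <- function(px, py, vx, vy) {
  n <- length(vx)
  inside <- FALSE
  j <- n
  for (i in seq_len(n)) {
    # Does the edge from vertex j to vertex i straddle the horizontal line y = py?
    if ((vy[i] > py) != (vy[j] > py)) {
      # x coordinate where that edge crosses the line y = py
      x_cross <- vx[i] + (py - vy[i]) * (vx[j] - vx[i]) / (vy[j] - vy[i])
      if (px < x_cross) inside <- !inside
    }
    j <- i
  }
  inside
}

# Hypothetical postcode zones as simple squares (illustrative only)
zones <- list(
  ZONE_A = list(x = c(0, 2, 2, 0), y = c(0, 0, 2, 2)),
  ZONE_B = list(x = c(2, 4, 4, 2), y = c(0, 0, 2, 2))
)

# Hypothetical addresses that already carry coordinates
addresses <- data.frame(
  id = c("a1", "a2", "a3"),
  x  = c(1, 3, 5),
  y  = c(1, 0.5, 1),
  stringsAsFactors = FALSE
)

# Return the name of the first zone containing the point, or NA if none does
assign_zone <- function(x, y, zones) {
  for (name in names(zones)) {
    z <- zones[[name]]
    if (point_in_polygon(x, y, z$x, z$y)) return(name)
  }
  NA_character_
}

addresses$zone <- mapply(assign_zone, addresses$x, addresses$y,
                         MoreArgs = list(zones = zones))
print(addresses)
```

Once each address carries a zone key like this, postcode or census attributes can be attached with an ordinary merge() on that key. The third address falls outside both squares and gets NA, which is the kind of case worth checking by hand.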