> On Dec 15, 2014, at 12:21 PM, Kurt Hornik <Kurt.Hornik at wu.ac.at> wrote:
>
>>>>>> Spencer Graves writes:
>
>> Hello, All:
>>       What would it take to make "iconv" portable?
>
>> I ask, because I want to convert accented characters to
>> vanilla ASCII, thereby converting, e.g., "Raúl" to "Raul", and
>> Milan Bouchet-Valat suggested on R-help that I use 'iconv(x,
>> "", "ASCII//TRANSLIT")'.  This worked under Windows but failed
>> on Linux and Mac.  It's part of the "subNonStandardCharacters"
>> function in the Ecfun package.  The development version on
>> R-Forge uses this and returns "Raul" under Windows and NA
>> under Mac OS X (and presumably also Linux).
>
> Hmm.
>
> R> iconv("Raúl", "", "ASCII//TRANSLIT")
> [1] "Raul"
>
> seems to work for me on Linux ...
>

also on OS X:

> iconv("Raúl", "", "ASCII//TRANSLIT")
[1] "Ra'ul"

> -k
>
>> The "iconv" R code merely calls compiled code, which I've used very little in 30 years.
>
>> Thanks,
>> Spencer
>
>>> On Nov 30, 2014, at 2:32 AM, Spencer Graves <spencer.graves at structuremonitoring.com> wrote:
>>>
>>> Wonderful.  Thanks very much.  Spencer
>>>
>>> On 11/30/2014 2:25 AM, Milan Bouchet-Valat wrote:
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

> On Dec 15, 2014, at 10:13 AM, Simon Urbanek <simon.urbanek at r-project.org> wrote:
>
>> On Dec 15, 2014, at 12:21 PM, Kurt Hornik <Kurt.Hornik at wu.ac.at> wrote:
>>
>> R> iconv("Raúl", "", "ASCII//TRANSLIT")
>> [1] "Raul"
>>
>> seems to work for me on Linux ...
>
> also on OS X:
>
>> iconv("Raúl", "", "ASCII//TRANSLIT")
> [1] "Ra'ul"

Thanks for the replies.  I should have checked my examples more carefully.  Consider the following example and a slight modification from help("iconv"):

> x <- c("Ekstr\xf8m", "J\xf6reskog", "bi\xdfchen Z\xfcrcher")
> Encoding(x) <- "latin1"
> x
[1] "Ekstrøm"         "Jöreskog"        "bißchen Zürcher"
> iconv(x, "latin1", "ASCII//TRANSLIT")  # platform-dependent
[1] "Ekstrom"            "J\"oreskog"         "bisschen Z\"urcher"
>
> x <- c("Ekstr\xf8m", "J\xf6reskog", "bi\xdfchen Z\xfcrcher")
> x
[1] "Ekstr\xf8m"         "J\xf6reskog"        "bi\xdfchen Z\xfcrcher"
> iconv(x, "", "ASCII//TRANSLIT")  # platform-dependent
[1] NA NA NA

This suggests a two-step fix to my problem:

(1) Check Encoding(x) and set to "latin1" if it's "unknown".

(2) Delete any new \" added by iconv.

Thanks again,
Spencer

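The two steps listed in this message can be sketched in R roughly as follows. This is only an illustration under stated assumptions: `to_ascii_sketch` is a hypothetical name (it is not the Ecfun `subNonStandardCharacters` code), and the sketch assumes that every string without a declared encoding really contains latin1 bytes, which may not hold for data from other sources.

```r
# Hypothetical helper; a sketch of the two-step idea, not the Ecfun code.
to_ascii_sketch <- function(x) {
  # Step 1: mark strings with no declared encoding as latin1.
  # ASSUMPTION: undeclared input really is latin1. Pure-ASCII strings are
  # never marked, so they pass through unchanged.
  unknown <- Encoding(x) == "unknown"
  Encoding(x[unknown]) <- "latin1"

  # Use the marks to bring everything to UTF-8, then transliterate.
  u <- enc2utf8(x)
  y <- iconv(u, "UTF-8", "ASCII//TRANSLIT")

  # Step 2: drop double quotes, but only in elements that had none on input,
  # so any quote removed here was introduced by the transliteration.
  had_none <- !grepl('"', u, fixed = TRUE)
  y[had_none] <- gsub('"', "", y[had_none], fixed = TRUE)
  y
}

x <- c("Ekstr\xf8m", "J\xf6reskog", "bi\xdfchen Z\xfcrcher")
to_ascii_sketch(x)   # exact letters still depend on the platform's iconv
```

Only `"` is removed; other ASCII markers a transliteration may introduce, such as the `'` in the "Ra'ul" result reported earlier in the thread, are left alone.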
On Dec 15, 2014, at 1:37 PM, Spencer Graves <spencer.graves at prodsyse.com> wrote:

> > x <- c("Ekstr\xf8m", "J\xf6reskog", "bi\xdfchen Z\xfcrcher")
> > x
> [1] "Ekstr\xf8m"         "J\xf6reskog"        "bi\xdfchen Z\xfcrcher"
> > iconv(x, "", "ASCII//TRANSLIT")  # platform-dependent
> [1] NA NA NA
>
> This suggests a two-step fix to my problem:  (1) Check Encoding(x) and set to "latin1" if it's "unknown".

Well, that depends heavily on your source. In the above it is hand-crafted latin1, so if you don't declare it, the native encoding will be assumed - which can be anything and has nothing to do with your actual input in this particular case, where you hand-constructed latin1.

> (2) Delete any new \" added by iconv.

The whole point of translit is to create combinations of ASCII characters that represent the Unicode characters, so " is just one of many characters that can be used.

Cheers,
S

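The point about declaring the source encoding can be illustrated with a minimal sketch. It assumes the input bytes really are latin1, as in the hand-constructed example quoted above; the variable names `declared` and `native` are made up for illustration.

```r
x <- "Ekstr\xf8m"   # hand-written latin1 bytes; no encoding declared

# Source encoding stated explicitly: byte 0xf8 is read as U+00F8,
# whatever the platform's native encoding happens to be.
declared <- iconv(x, "latin1", "UTF-8")
utf8ToInt(declared)[6]   # 248, the code point U+00F8

# Source encoding left as "": iconv assumes the native encoding, so the
# result depends on the locale of the machine running the code. In a
# UTF-8 locale a lone 0xf8 byte is not valid input, which is a likely
# explanation for the NA results quoted above.
native <- iconv(x, "", "UTF-8")
```

Passing the actual source encoding as `from` keeps the first conversion independent of the session locale; the second call is shown only to contrast with it.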