Duncan Murdoch
2018-Aug-12 15:30 UTC
[R] source script file that contains Unicode non-English characters
On 12/08/2018 3:09 AM, Faridedin Cheraghi wrote:
> It was actually a .rmd file so you can get the coloring of the bug report
> in your text editor. I changed the format to .txt.

When I run your script on a Mac (in a UTF-8 locale), all lines work as expected. I'm guessing you are working on Windows, in a non-UTF-8 locale?

Posting sessionInfo() would be helpful.

Duncan Murdoch

> -Farid
>
> On Sun, Aug 12, 2018 at 7:24 AM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
>
>> ... and read the Posting Guide... only a few file types will ever make it
>> through the mailing list so repeatedly sending files not among those few
>> types would just be frustrating for everyone.
>>
>> On August 11, 2018 4:51:43 PM PDT, Jim Lemon <drjimlemon at gmail.com> wrote:
>>> Hi Farid,
>>> Whatever you attached has not gotten through.
>>>
>>> Jim
>>>
>>> On Sat, Aug 11, 2018 at 6:47 PM, Farid Ch <faridcher at gmail.com> wrote:
>>>> Hi all,
>>>>
>>>> Please check the attached file.
>>>>
>>>> Thanks
>>>> Farid
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>
>> --
>> Sent from my phone. Please excuse my brevity.
Faridedin Cheraghi
2018-Aug-12 15:48 UTC
[R] source script file that contains Unicode non-English characters
that's right and I don't want to change my locale. my sessionInfo():

R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

thanks

On Sun, Aug 12, 2018 at 8:00 PM, Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
> When I run your script on a Mac (in a UTF-8 locale), all lines work as
> expected. I'm guessing you are working on Windows, in a non-UTF-8 locale?
>
> Posting sessionInfo() would be helpful.
>
> Duncan Murdoch
Duncan Murdoch
2018-Aug-12 16:33 UTC
[R] source script file that contains Unicode non-English characters
On 12/08/2018 11:48 AM, Faridedin Cheraghi wrote:
> that's right and I don't want to change my locale. my sessionInfo() :

I think it could be another manifestation of a known bug on Windows, where strings are converted from UTF-8 to the current locale and back to UTF-8, a lossy conversion. This has been present for many years, and requires a lot of internal changes to fix, so I wouldn't hold your breath waiting for a fix.

I believe the "right" fix is for R to always convert strings to UTF-8 internally. This wasn't possible when the internationalization code was added many years ago because not all platforms supported UTF-8. It would be a lot of work now, and since it isn't needed now on the platforms most developers use, it's not receiving a lot of attention.

Your workaround

    file(script, encoding = "UTF-8") %T>% source() %>% close() # works fine

is a nice way to avoid this problem.

Duncan Murdoch
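
For readers who do not use magrittr, the two points in the reply above can be sketched in base R. This is only a minimal sketch under stated assumptions, not the code from the original attachment: the Persian sample string, the temporary file, and the names `s`, `native`, `back`, `script` and `con` are all made up for illustration. The first part uses `iconv()` to imitate the kind of loss a UTF-8 -> CP1252 -> UTF-8 round trip causes (it does not reproduce R's internal code path); the second part spells out the piped workaround step by step.

```r
# Part 1: imitate a lossy UTF-8 -> CP1252 -> UTF-8 round trip.
# Illustration only; this is not the internal conversion R performs.
s <- "\u0633\u0644\u0627\u0645"   # Persian text, not representable in CP1252
native <- iconv(s, from = "UTF-8", to = "CP1252", sub = "?")  # unconvertible bytes become "?"
back <- iconv(native, from = "CP1252", to = "UTF-8")
identical(back, s)                # FALSE: the original characters were replaced

# Part 2: base-R equivalent of the piped workaround.
# `script` is a hypothetical temporary file created only for this example.
script <- tempfile(fileext = ".R")
writeLines('x <- "\u0633\u0644\u0627\u0645"', script, useBytes = TRUE)  # write the UTF-8 bytes as-is

con <- file(script, encoding = "UTF-8")  # declare the file's encoding on the connection
source(con)                              # evaluate the script through that connection
close(con)                               # release the connection afterwards
```

The three statements at the end do what the pipe chain does: `%T>%` hands the connection to `source()` and then passes the same connection on to `close()`.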