Prof Brian Ripley
2021-Feb-17 10:20 UTC
[Rd] issue with print()ing multibyte characters on R 4.0.4
On 17/02/2021 04:58, Hiroaki Yutani wrote:
> Hi all,
>
> I saw several people on Japanese locale claim that, on R 4.0.4,
> print() doesn't display Japanese characters correctly. This seems to
> happen only on Windows and on macOS (I usually use Linux and I don't
> see this problem).
>
> For example, in the result below, "鬼" and "外" are displayed in
> "\uXXXX" format. What's curious here is that "は" is displayed as it
> is, by the way.
>
> > "鬼は外"
> [1] "\u9b3cは\u5916"
>
> But, if I use such functions as message() or cat(), the string is
> displayed as it is.
>
> > message("鬼は外")
> 鬼は外

message() does not escape non-printable characters, so that is as expected.

> Considering the fact that it seems only Windows and macOS are
> affected, I suspect this is somehow related to this change described
> in the release note (though I have no idea what change this is):
>
>     The internal table for iswprint (used on Windows, macOS and AIX)
>     has been updated to include many recent Unicode characters.
>     (https://cran.r-project.org/doc/manuals/r-release/NEWS.html)
>
> Before I file this issue on Bugzilla, I'd like to confirm whether
> this is an intended change and, if it is actually intended, I want
> to discuss how to improve this behaviour.

I am sorry: this was not intended, but no one reported it in the run-up to 4.0.4. It seems to be working in R-devel, so I suggest you check that or go back to 4.0.3.

It looks like a line in the iswprint table got deleted in the merge from R-devel. I will try to set up some automated checks to see if I can find any other problems, but that will take a few days.

--
Brian D. Ripley, ripley at stats.ox.ac.uk
Emeritus Professor of Applied Statistics, University of Oxford
Hiroaki Yutani
2021-Feb-17 13:47 UTC
[Rd] issue with print()ing multibyte characters on R 4.0.4
Thanks for confirming and investigating.

> but no one reported it in the run-up to 4.0.4.

Yes, it was unfortunate that no one had reported it to the right place before the release...
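
For anyone wanting to check whether a particular installation shows the behaviour discussed in this thread, the following is a minimal sketch in R. It assumes a session whose locale can display Japanese text (for example a UTF-8 locale or a Japanese Windows locale); the variable names `s`, `printed` and `affected` are illustrative and do not come from the original messages. The string is written with `\u` escapes only so that the script itself stays ASCII.

```r
# The string from the report, written with escapes so this file is pure ASCII.
s <- "\u9b3c\u306f\u5916"

# print() escapes characters that R considers non-printable;
# capture the text it would write to the console.
printed <- capture.output(print(s))

# On an affected build the captured text contains a literal backslash-u escape.
affected <- any(grepl("\\u", printed, fixed = TRUE))

cat("R version:       ", R.version.string, "\n")
cat("print() output:  ", printed, "\n")
cat("cat() output:    ", s, "\n")  # cat() writes the characters without escaping
cat("escaped by print:", affected, "\n")
```

If the last line reports TRUE on Windows or macOS under R 4.0.4, that matches the report above; per the reply in this thread, R-devel or R 4.0.3 should not show the escapes.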