Garbled characters appear in the Chinese comments when I open a script in RGui (see attached picture). I have tried a variety of methods without success. I hope someone can help me solve this problem. Thank you.
Spencer Graves
2022-Apr-25 11:16 UTC
[R] about opening R script Chinese annotation garble problem
Attachments are NOT delivered to the readers of this list. PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. This includes sessionInfo(). If you can reproduce it on multiple platforms, that would help.

Can you post the image to the web someplace (e.g., a Google Drive) and provide the link? That would also help.

Thanks,
Spencer Graves

On 4/23/22 8:38 PM, 永创 via R-help wrote:
> Garbled characters appear in Chinese annotation when opening program script using RGui (see attached picture). I use a variety of methods have not been solved, I hope to help me solve this problem. Thank you.
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Bill Dunlap
2022-Apr-25 15:52 UTC
[R] about opening R script Chinese annotation garble problem
The answer depends on the encoding of the file containing the Chinese
characters and on the version of R (since you are using Windows). I copied
your subject line into WordPad and added some syntax to make a valid R
expression:
s <- "永创 via R-help"
I then saved it with the type "Unicode Text Document". In my version of
WordPad this means UTF-16 (little-endian, with a leading ff fe byte-order
mark). The bytes in the file are
4.2.0> readBin("Chinese-utf-16.txt", what="raw",
               n=file.size("Chinese-utf-16.txt"))
[1] ff fe 73 00 20 00 3c 00 2d 00 20 00 22 00 38 6c 1b 52
[19] 20 00 76 00 69 00 61 00 20 00 52 00 2d 00 68 00 65 00
[37] 6c 00 70 00 22 00 0d 00 0a 00
All the nulls in the file are a hint that this is encoded using UTF-16, not
UTF-8.
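The leading ff fe in the dump above is a byte-order mark, so the same check can be automated. The following is only a minimal sketch: guess_bom_encoding() is a hypothetical helper, not a base R function, and it only looks at the first three bytes. A file without a byte-order mark gives NA, and a UTF-32LE file (which starts with ff fe 00 00) would be misreported as UTF-16LE.

```r
# Hypothetical helper (not part of base R): guess a file's encoding
# from a byte-order mark in its first bytes.
guess_bom_encoding <- function(path) {
  head_bytes <- readBin(path, what = "raw", n = 3L)
  starts_with <- function(sig) {
    length(head_bytes) >= length(sig) &&
      identical(head_bytes[seq_along(sig)], as.raw(sig))
  }
  if (starts_with(c(0xef, 0xbb, 0xbf))) {
    "UTF-8"
  } else if (starts_with(c(0xff, 0xfe))) {
    "UTF-16LE"
  } else if (starts_with(c(0xfe, 0xff))) {
    "UTF-16BE"
  } else {
    NA_character_  # no BOM found: the encoding cannot be inferred this way
  }
}

# Demo on a temporary file that starts with the same bytes as the dump above.
demo <- tempfile(fileext = ".txt")
writeBin(as.raw(c(0xff, 0xfe, 0x73, 0x00, 0x20, 0x00)), demo)
print(guess_bom_encoding(demo))
```

The result could then be passed as the encoding argument when reading or sourcing the file.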
With R-4.2.0 (released a few days ago) I can source the file with
4.2.0> source("Chinese-utf-16.txt", encoding="UTF-16")
4.2.0> s
[1] "永创 via R-help"
4.2.0> Encoding(s)
[1] "UTF-8"
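An alternative that should not depend on the R version is to re-encode the file to UTF-8 once and work with the UTF-8 copy afterwards. The sketch below is an assumption, not something tested in this thread: utf16le_file_to_utf8() is a hypothetical helper, it assumes the input is little-endian UTF-16 as in the dump above, and it converts the raw bytes with iconv() so the text never passes through the native encoding.

```r
# Hypothetical helper: rewrite a UTF-16LE file as UTF-8 (no BOM in the output).
utf16le_file_to_utf8 <- function(infile, outfile) {
  bytes <- readBin(infile, what = "raw", n = file.size(infile))
  # Drop a leading little-endian byte-order mark (ff fe) if present.
  if (length(bytes) >= 2L &&
      identical(bytes[1:2], as.raw(c(0xff, 0xfe)))) {
    bytes <- bytes[-(1:2)]
  }
  # iconv() accepts a list of raw vectors, so embedded 00 bytes are fine.
  txt <- iconv(list(bytes), from = "UTF-16LE", to = "UTF-8")
  if (is.na(txt)) stop("conversion from UTF-16LE failed")
  # Write the UTF-8 bytes unchanged; line endings are kept as they were.
  writeBin(charToRaw(txt), outfile)
  invisible(outfile)
}

# Example call (file names are placeholders):
# utf16le_file_to_utf8("Chinese-utf-16.txt", "Chinese-utf-8.R")
# source("Chinese-utf-8.R", encoding = "UTF-8")
```

Whether RGui then displays the comments correctly still depends on the R version and locale, so this addresses the file encoding only.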
With R-4.1.2 I get
> source("Chinese-utf-16.txt", encoding="UTF-16")
Error in source("Chinese-utf-16.txt", encoding = "UTF-16") :
  Chinese-utf-16.txt:1:6: unexpected INCOMPLETE_STRING
1: s <- "
        ^
In addition: Warning message:
In readLines(file, warn = FALSE) :
  invalid input found on input connection 'Chinese-utf-16.txt'
> source(file("Chinese-utf-16.txt", encoding="UTF-16"))
> s
[1] "<U+6C38><U+521B> via R-help"
> source(file("Chinese-utf-16.txt", encoding="UTF-16"),
         encoding="UTF-8")
> s
[1] "永创 via R-help"
> Encoding(s)
[1] "UTF-8"
> charToRaw(s)
[1] e6 b0 b8 e5 88 9b 20 76 69 61 20 52 2d 68 65 6c 70
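The first six bytes of that output can be cross-checked against the earlier results. As a small sketch using only base R functions, the code points U+6C38 and U+521B (the ones shown as escapes above) encode to e6 b0 b8 e5 88 9b in UTF-8, and the UTF-16LE bytes 38 6c 1b 52 from the readBin() dump decode to the same two code points. The variable names are illustrative.

```r
# The two code points that appeared as <U+6C38><U+521B>.
code_points <- c(0x6C38L, 0x521BL)

# Encode them as a UTF-8 string and inspect the bytes.
s2 <- intToUtf8(code_points)
utf8_bytes <- charToRaw(s2)
print(utf8_bytes)

# The same characters as stored in the UTF-16LE file (from the byte dump).
utf16le_bytes <- as.raw(c(0x38, 0x6c, 0x1b, 0x52))
decoded <- iconv(list(utf16le_bytes), from = "UTF-16LE", to = "UTF-8")

# Both routes should give the same code points.
print(identical(utf8ToInt(decoded), code_points))
```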
R-4.2.0 makes this much easier.
-Bill