Cc'ing Kurt since the version control history shows he brought it in a
few years ago: https://github.com/wch/r-source/commit/53d4b432f7
The fix can be fairly simple if someone has one minute:

    lc_ctype <- Sys.getlocale("LC_CTYPE")
    on.exit(Sys.setlocale("LC_CTYPE", lc_ctype), add = TRUE)
    Sys.setlocale("LC_CTYPE", "C")

although I do not really understand why LC_CTYPE has to be changed to
"C".
Regards,
Yihui
--
Yihui Xie <xieyihui at gmail.com>
Web: http://yihui.name
On Wed, May 14, 2014 at 4:34 PM, Yihui Xie <xie at yihui.name> wrote:
> Hi,
>
> read.dcf() can modify the locale variable LC_CTYPE, and here is a
> minimal example:
>
>> Sys.getlocale('LC_CTYPE')
> [1] "en_US.UTF-8"
>> read.dcf(textConnection('a: b'), all = TRUE)
> a
> 1 b
>> Sys.getlocale('LC_CTYPE')
> [1] "C"
>
> After diagnosing the problem, it seems the on.exit() call in
> read.dcf() is the culprit:
>
> on.exit(Sys.setlocale("LC_CTYPE", Sys.getlocale("LC_CTYPE")), add = TRUE)
> Sys.setlocale("LC_CTYPE", "C")
>
> https://github.com/wch/r-source/blob/96a2cc920/src/library/base/R/dcf.R#L68-L69
>
>> sessionInfo()
> R version 3.1.0 (2014-04-10)
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8
>  [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C
> [10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> loaded via a namespace (and not attached):
> [1] tools_3.1.0
>
>
> Regards,
> Yihui
> --
> Yihui Xie <xieyihui at gmail.com>
> Web: http://yihui.name