Stephan Kolassa
2010-Jan-12 11:25 UTC
[R] Reading a file with mixed cyrillic/latin characters
Dear useRs,
I am trying to read a tab-delimited Unicode text file containing both
latin and cyrillic characters and failing miserably. The file looks like
this (I hope it comes across right):
A B C
3 foo ???
5 bar ???
read.table("foo.txt",sep="\t",header=TRUE)
I am guessing that I can use the fileEncoding argument to read.table()
to read this, but I can find no list of supported values of
fileEncoding, and fileEncoding="Unicode" gives an error.
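A minimal sketch of one way this might be approached, assuming the file was saved by a Windows tool whose "Unicode" option means UTF-16LE (an assumption, not something the file above confirms). The encoding names R accepts are whatever the platform's iconv supports, and `iconvlist()` prints them. The demo file, its path, and its Cyrillic strings are hypothetical stand-ins written to a temporary location so the example is self-contained.

```r
# Encoding names accepted by fileEncoding come from the platform's iconv.
encodings <- iconvlist()
head(encodings)

# Hypothetical demo file: a small tab-delimited table with a Cyrillic column,
# written as UTF-16LE without a byte order mark.
path <- tempfile(fileext = ".txt")
txt <- "A\tB\tC\n3\tfoo\t\u0430\u0431\u0432\n5\tbar\t\u0433\u0434\u0435\n"
writeBin(iconv(txt, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]], path)

# Name the file's real encoding; R re-encodes the input to the session's
# native encoding while reading.
df <- read.table(path, sep = "\t", header = TRUE,
                 fileEncoding = "UTF-16LE", stringsAsFactors = FALSE)
str(df)
```

Because the text is converted to the native encoding on input, the Cyrillic column can only survive intact if the session's locale is able to represent those characters; a Western European code page such as 1252 cannot.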
The FAQ and the FAQ for Windows don't help. I have searched both the
list archives and RSeek and am still seeking enlightenment. I am running
R 2.10.1 on Windows XP, sessionInfo() below.
Cheers
Stephan
R version 2.10.1 (2009-12-14)
i386-pc-mingw32
locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252
[3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
[5] LC_TIME=German_Germany.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
