Erich Studerus
2010-May-12 12:37 UTC
[R] Input encoding problem when using sweave with xetex
Hello,

Because I want to use different TrueType fonts with LaTeX, I'm using the XeTeX typesetting engine for my Sweave documents. I'm using LyX with Sweave on a Windows 7 PC and have set up LyX to work with XeTeX according to the following instructions:

http://wiki.lyx.org/LyX/XeTeX

Because the input file for XeTeX is assumed to be in UTF-8 encoding, I set the encoding under LyX - Tools - Language Settings - Language to "Unicode (XeTeX) (utf8)". Accented letters that I type into the LyX document are correctly typeset in the final PDF document. However, character strings with accented letters that are read from Excel files or other sources from within R during the LyX-Sweave document compilation are not. For instance, the German umlauts in the following example are not correctly typeset when "Unicode (XeTeX) (utf8)" is used as the input encoding.

<<echo=F>>=
require(gdata)
x <- read.xls("http://www.schwerhoerigkeit.pop.ch/hoergeraete_test.xls", stringsAsFactors = F)[2,2]
x
@

I do not have this problem on a Mac. I guess this is because R under Windows does not use UTF-8 encoding. I tried to change the encoding within R by doing the following:

<<echo=F>>=
Encoding(x) <- 'UTF-8'
x
@

Unfortunately, this does not work. Does anybody have a solution for this problem?

Regards,
Erich
Duncan Murdoch
2010-May-12 13:26 UTC
[R] Input encoding problem when using sweave with xetex
On 12/05/2010 8:37 AM, Erich Studerus wrote:
> [...]
> I tried to change the encoding within R by doing the following
>
> <<echo=F>>=
> Encoding(x) <- 'UTF-8'
> x
> @
>
> Unfortunately, this does not work. Does anybody have a solution for this
> problem?

You need to use iconv() to change an encoding. What you did just changes the declared encoding, but doesn't actually change any bits. So you'd probably get what you want with

x <- iconv(x, "", "UTF-8")
x

(though you may need to declare the input encoding; it is likely CP1252 on Windows).

Duncan Murdoch
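The difference between re-declaring an encoding and converting it can be checked at the byte level. The following is only a minimal sketch, not part of the original exchange: it builds the letter "ü" from a raw byte so that the result does not depend on the encoding of the source file, and it assumes "latin1" as the input encoding because that name is accepted by iconv() on every R platform (for "ü", CP1252 uses the same byte). The variable names y and z are made up for the illustration.

```r
# Build a one-byte string for "ü" (0xFC in latin1 and in CP1252),
# independent of how this script itself is encoded.
x <- rawToChar(as.raw(0xfc))
Encoding(x) <- "latin1"          # declare what the byte really is

# Re-declaring the encoding only changes the label; the byte stays the same.
y <- x
Encoding(y) <- "UTF-8"
charToRaw(y)                     # fc (not a valid UTF-8 sequence)

# iconv() rewrites the bytes into the target encoding.
z <- iconv(x, from = "latin1", to = "UTF-8")
charToRaw(z)                     # c3 bc
Encoding(z)                      # "UTF-8"
```

With the data from the original post, the equivalent call would presumably be iconv(x, "CP1252", "UTF-8"), assuming the string read from the Excel file really is in CP1252; that has not been verified here.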