On 05/08/2012 09:54, Milan Bouchet-Valat wrote:
> Hi!
>
> I'm using R2HTML in my RcmdrPlugin.temis package to output localized
> strings to a HTML file. Thus, I insert a simple header at the top of the
> file to specify what encoding is used; if I don't do that, Web browsers
> assume it is latin1, which is not always true.
>
> My problem is, I could not find a way to detect what encoding is used by
> R2HTML in the most general case. R2HTML simply calls cat() with the file
> name, which means the text connection is opened using file(encoding =
> getOption("encoding")). This is fine, except that when
> getOption("encoding") is set to "native.enc", I'm not able to find out
> the real encoding that was used for output.
>
> Of course, ideally I would tell R2HTML to output everything as UTF-8,
> and I would add this information to the header. But AFAICT this is not
> possible in the current state of this package. So I would be very
> grateful if somebody could provide me with a solution to resolve
> "native.enc" to the encoding name.

?options points you to ?connections, which does explain this. See
Sys.getlocale("LC_CTYPE") to see 'the internal encoding of the current
locale' (or at least, what the OS claims it to be: e.g. some lie about
'C' locales).
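
A minimal sketch of how that locale information might be turned into a name for an HTML charset header. The helper name resolve_native_enc is hypothetical. It assumes that l10n_info() reports the common cases and that the codeset, when present, follows the dot in the locale name; neither holds on every platform, so it returns NA when it cannot tell.

```r
# Hypothetical helper: best-effort guess at the name behind "native.enc".
resolve_native_enc <- function() {
  info <- l10n_info()

  # The two cases R itself can identify with certainty.
  if (isTRUE(info[["UTF-8"]])) return("UTF-8")
  if (isTRUE(info[["Latin-1"]])) return("ISO-8859-1")

  # On Windows, l10n_info() also reports the code page number.
  if (!is.null(info$codepage) && info$codepage > 0) {
    return(paste0("CP", info$codepage))
  }

  # Otherwise look for a codeset in the locale name, e.g. "cs_CZ.ISO-8859-2".
  loc <- Sys.getlocale("LC_CTYPE")
  m <- regmatches(loc, regexpr("(?<=\\.)[^.@]+", loc, perl = TRUE))
  if (length(m) == 1L) return(m)

  # Nothing usable in the locale name (e.g. the "C" locale).
  NA_character_
}

enc <- resolve_native_enc()
```

An NA result corresponds to the case where the encoding is not part of the locale name, so the caller would still need a fallback.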
As for a name, iconv() knows this as "" (and some OSes do make it rather
hard to find a name if it is not part of the locale name).
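
Since iconv() accepts "" for the native encoding, one possible way around needing the name at all is to convert text to a known encoding before it is written. This is only a sketch: native_to_utf8 is a hypothetical helper name, and whether the converted strings survive a later write depends on how the output connection is opened.

```r
# from = "" means "the encoding of the current locale", whatever its name is.
native_to_utf8 <- function(x) iconv(x, from = "", to = "UTF-8")

s <- native_to_utf8("plain ASCII survives unchanged")
```

iconv() returns NA for input it cannot convert, so the result is worth checking before it is used.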
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595