Philippe Grosjean
2004-Jan-09 09:12 UTC
[R] Which character coding for source under Windows?
I know that R can cope with the different line-ending conventions (carriage return and/or line feed, as used on Unix, Windows, or Mac), which is very nice. However, it is not clear to me which character encoding is used: ASCII, ANSI, or something else? For instance, there is not much difference between the ANSI and DOS encodings for the first 128 characters, but they are very different for the rest.

Best,

Philippe Grosjean

Prof. Philippe Grosjean
Numerical Ecology of Aquatic Systems
Mons-Hainaut University, Pentagone
8, Av. du Champ de Mars, 7000 Mons, Belgium
phone: + 32.65.37.34.97, fax: + 32.65.37.33.12
email: Philippe.Grosjean at umh.ac.be
SciViews project coordinator (http://www.sciviews.org)
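The claim about the first 128 characters can be checked with a small sketch. It is written in Python rather than R, purely as an illustration. It assumes that "ANSI" means Windows code page 1252 and that "DOS" means code page 437; both are assumptions, since the message names no specific code pages.

```python
# Compare two 8-bit encodings byte by byte.
# Assumption: cp1252 stands in for "ANSI" and cp437 for one "DOS" code page.
ANSI = "cp1252"
DOS = "cp437"


def decode_byte(value, encoding):
    """Decode a single byte value (0-255) under the given encoding."""
    # errors="replace" because cp1252 leaves a few byte values undefined.
    return bytes([value]).decode(encoding, errors="replace")


# Byte values 0-127: do both encodings agree on every one?
low_same = all(decode_byte(b, ANSI) == decode_byte(b, DOS) for b in range(128))

# Byte values 128-255: collect those on which the two encodings disagree.
high_diff = [b for b in range(128, 256)
             if decode_byte(b, ANSI) != decode_byte(b, DOS)]

print("identical below 128:", low_same)
print("differing bytes at or above 128:", len(high_diff))
print("byte 0xE9:", repr(decode_byte(0xE9, ANSI)), "vs", repr(decode_byte(0xE9, DOS)))
```

The same byte value is stored in the file either way; only the table used to turn it into a displayed character differs.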
Prof Brian Ripley
2004-Jan-09 09:55 UTC
[R] Which character coding for source under Windows?
Unless you change it, no encoding is used. That is, characters are just treated as 8-bit numbers (as they are in all C programs). Encodings are only relevant if you want to display a character (or type at a keyboard), and in general R assumes that you have set your fonts and keyboard to a single consistent encoding (which Petr Pikal had not).

You can re-encode on input (see ?connections) and on output where there is an encoding step (see ?postscript). So if you have Mac files you can re-encode them on read transparently. What you can't do is re-encode text files on output, mainly because there is no way to mark such files as encoded.

On Fri, 9 Jan 2004, Philippe Grosjean wrote:

> I know that R can cope with the different formats regarding carriage return
> and/or line feed (the Unix, or Windows, or Mac convention), which is very
> nice. However, it is not clear in my mind which character encoding is used:
> ASCII, ANSI, other? There is not much differences between ANSI and DOS
> encoding for instance, for the first 128 characters. But it is very
> different for the rest.

I don't believe there is a single `DOS' encoding, rather a whole series of codepages. And ASCII is a 7-bit encoding. There are various wide encodings out there too.

--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK, Fax: +44 1865 272595
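The idea of re-encoding on input can be sketched as follows. This is Python, not R, and does not show how R's connections do it; it only illustrates the general principle that a file holds 8-bit numbers and that the reader chooses the table used to interpret them. The file contents and the choice of Mac Roman and cp1252 are assumptions made for the example.

```python
import os
import tempfile

# Assumption: a "Mac file" here means text stored in the Mac Roman encoding.
# Byte 0x8E is e-acute in Mac Roman but a different character in cp1252.
raw = b"caf" + bytes([0x8E])

# Write the raw 8-bit values to a temporary file; no encoding is involved.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(raw)
    path = handle.name

try:
    # Reading as bytes gives back the same numbers, untouched.
    with open(path, "rb") as f:
        as_numbers = list(f.read())

    # Re-encoding on input: the reader states which table the bytes follow.
    with open(path, encoding="mac_roman") as f:
        as_mac = f.read()

    # The same bytes read under a mismatched table yield different text.
    with open(path, encoding="cp1252") as f:
        as_ansi = f.read()
finally:
    os.remove(path)

print(as_numbers)
print(repr(as_mac), repr(as_ansi))
```

Nothing in the file itself says which of the two readings is right, which is the difficulty with marking text files as encoded.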