On Wed, 29 Dec 2010, Kevin R. Coombes wrote:
> Hi,
>
> I have a data frame that includes several names that (if typeset correctly)
> require accented characters not available in the ASCII character set.
>
> I'd like to include this data frame as example data in an R package. I'd
> also like the R CMD check warning about the use of non-ASCII characters to go
> away, in part so I could submit the package somewhere that wouldn't balk at
> the presence of the warning. (I gather from older posts that there are
> environment variables to skip this check. Those will work for me personally
> but will not necessarily appease the maintainers of sites like CRAN where I
> might want to submit the package.)
>
> Is there any way to use the correctly accented characters by setting a
> different character encoding or equivalent for the data frame? Or am I forced
> to remove the offending accents in order to be ASCII-pure and thus leave
> people and places with an incorrect representation of their names?

The latter is inevitable. There is no encoding that will work
correctly for everyone (see 'Writing R Extensions' §1.7.1): e.g.
Chinese Windows users have only ASCII and Chinese characters (and only
one of two sets of Chinese characters). Again, good practice and
compromises are discussed in 'Writing R Extensions' -- these days
using UTF-8 will do a good job for most R users.
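
For illustration only, here is a minimal sketch of what the UTF-8 route might
look like in R. The data frame `people` and the helper `has_non_ascii` are
hypothetical names, not anything from the original package. The accented
letters are written as \u escapes, so the source file itself stays pure ASCII
while R marks the resulting strings as UTF-8. Whether a particular repository's
checks accept such data is not something this sketch can settle.

```r
# Hypothetical example data: accented names written with \u escapes,
# so this source file contains only ASCII characters.
people <- data.frame(
  name = c("Fran\u00e7ois", "Jos\u00e9", "Smith"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE for each element that contains at least one
# code point outside the ASCII range (greater than 127).
has_non_ascii <- function(x) {
  vapply(
    x,
    function(s) any(utf8ToInt(enc2utf8(s)) > 127L),
    logical(1),
    USE.NAMES = FALSE
  )
}

flags <- has_non_ascii(people$name)

# Encoding() reports the declared encoding of each string; pure ASCII
# strings are always reported as "unknown".
marks <- Encoding(people$name)

print(data.frame(name = people$name, non_ascii = flags, encoding = marks))
```

A check like this can be run over the character columns of a data frame before
saving it, to see exactly which entries would trigger the non-ASCII warning.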
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595