thr3ads.net - R devel - [Rd] Want non-ASCII characters in data package [Dec 2010]

Kevin R. Coombes

2010-Dec-29 16:45 UTC

[Rd] Want non-ASCII characters in data package

Hi,

I have a data frame that includes several names that (if typeset 
correctly) require accented characters not available in the ASCII 
character set.

I'd like to include this data frame as example data in an R package.  
I'd also like the R CMD check warning about the use of non-ASCII 
characters to go away, in part so I could submit the package somewhere 
that wouldn't balk at the presence of the warning.  (I gather from older 
posts that there are environment variables to skip this check.  Those 
will work for me personally but will not necessarily appease the 
maintainers of sites like CRAN where I might want to submit the package.)

Is there any way to use the correctly accented characters by setting a 
different character encoding or equivalent for the data frame? Or am I 
forced to remove the offending accents in order to be ASCII-pure and 
thus leave people and places with an incorrect representation of their 
names?

     Kevin

Prof Brian Ripley

2011-Jan-01 10:58 UTC

[Rd] Want non-ASCII characters in data package

On Wed, 29 Dec 2010, Kevin R. Coombes wrote:
> Hi,
>
> I have a data frame that includes several names that (if typeset correctly)
> require accented characters not available in the ASCII character set.
>
> I'd like to include this data frame as example data in an R package. I'd
> also like the R CMD check warning about the use of non-ASCII characters to go
> away, in part so I could submit the package somewhere that wouldn't balk at
> the presence of the warning.  (I gather from older posts that there are
> environment variables to skip this check.  Those will work for me personally
> but will not necessarily appease the maintainers of sites like CRAN where I
> might want to submit the package.)
>
> Is there any way to use the correctly accented characters by setting a
> different character encoding or equivalent for the data frame? Or am I forced
> to remove the offending accents in order to be ASCII-pure and thus leave
> people and places with an incorrect representation of their names?

The latter is inevitable.  There is no encoding that will work 
correctly for everyone (see 'Writing R Extensions' §1.7.1): e.g. 
Chinese Windows users have only ASCII and Chinese characters (and only 
one of two sets of Chinese characters).  Again, good practice and 
compromises are discussed in 'Writing R Extensions' -- these days 
using UTF-8 will do a good job for most R users.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
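
A minimal sketch of the UTF-8 route mentioned in the reply, assuming the
names are stored in a character column. The data frame `people`, its column
`name`, and the two example names are hypothetical and not taken from the
thread. Writing the accented letters as \u escapes keeps the source file
itself pure ASCII, while the resulting strings are marked as UTF-8. R CMD
check may still report a note about marked UTF-8 strings in the data.

```r
# Hypothetical example data: the names are illustrative only.
# "\u00e9" is e-acute and "\u00f6" is o-umlaut, written as ASCII escapes.
people <- data.frame(
  name = c("Ren\u00e9 Descartes", "Kurt G\u00f6del"),
  stringsAsFactors = FALSE
)

# Make sure the strings are encoded and marked as UTF-8.
people$name <- enc2utf8(people$name)

# Inspect the declared encoding of each element.
print(Encoding(people$name))

# Save the data frame as it might be stored in a package's data/ directory
# (a temporary file is used here so the sketch has no side effects).
out_file <- tempfile(fileext = ".rda")
save(people, file = out_file)
```

Strings that contain only ASCII characters keep the encoding mark "unknown",
so only the accented entries are affected.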
