Hi!

I've been following the recent SVN log entries about storing encoding information in CHARSXPs with great interest. This looks like a very nice addition.

While this is still work in progress, I'd like to suggest one extra: At least in RKWard, all strings that are shown need to be converted to UTF-8 (the encoding we use when converting to Qt QStrings). This needs to be done independently of the current locale and of the encoding used in the embedded R process. I imagine other graphical or non-graphical toolkits similarly use UTF-8 to store strings internally.

For this reason, an addition of e.g.

    char* Rf_translateCharToUTF8(SEXP);

would be nice. This function would translate to UTF-8 independently of the current LC_CTYPE.

It is possible to achieve the same effect by first translating the strings to the current LC_CTYPE encoding (using Rf_translateChar()), and then translating to UTF-8 in a second step (using custom means, if needed). However, doing this conversion in a single step would be more elegant, and could also avoid an expensive extra recoding step.

Alternatively, having access to the IS_UTF8 and IS_LATIN1 macros from C would be good enough to hand-code an efficient conversion to UTF-8 (but this may be too close to the internals).

I am not sure whether this is considered important enough to warrant inclusion in the API, but I wanted to throw in the idea in time.

Regards
Thomas Friedrichsmeier
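
P.S. To illustrate the "hand-coded" alternative: below is a minimal sketch in C of a Latin-1 to UTF-8 conversion, assuming the caller already knows (for example via something like IS_LATIN1) that the string is Latin-1. The name latin1_to_utf8 is hypothetical and not part of the R API, and the sketch works on a plain char* rather than on a CHARSXP. It relies only on the fact that Latin-1 bytes below 0x80 are plain ASCII and bytes from 0x80 to 0xFF map to two-byte UTF-8 sequences.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of the R API): convert a NUL-terminated
 * Latin-1 string to a newly allocated UTF-8 string.
 * Returns NULL if allocation fails; the caller must free() the result. */
char *latin1_to_utf8(const char *latin1)
{
    assert(latin1 != NULL);

    size_t len = strlen(latin1);
    /* Each Latin-1 byte expands to at most two UTF-8 bytes. */
    char *out = malloc(2 * len + 1);
    if (out == NULL)
        return NULL;

    char *p = out;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char) latin1[i];
        if (c < 0x80) {
            /* ASCII range: identical in Latin-1 and UTF-8. */
            *p++ = (char) c;
        } else {
            /* U+0080..U+00FF: two-byte sequence 110000xx 10xxxxxx. */
            *p++ = (char) (0xC0 | (c >> 6));
            *p++ = (char) (0x80 | (c & 0x3F));
        }
    }
    *p = '\0';
    return out;
}
```

A string already flagged as UTF-8 could be passed through unchanged, so only strings in some other native encoding would still need the two-step route through Rf_translateChar().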