Following the report of Henrik Bengtsson, which Duncan Murdoch traced to a bug in the regexp code from glibc-2.3.3 used in R, I have updated the code in R-devel to that from glibc-2.3.5. This has that bug fixed and about 700 other changes, most of which seem to be bug fixes.

My tests on Linux (including of all CRAN packages) have detected no differences (but then I was not able to reproduce Henrik's problem on Linux, only on Windows). If anyone finds a difference that they think is not a bug fix, please let me know. There are also quite a few new optimizations, especially in UTF-8 locales.

It seems fairly clear that the code available for multi-byte character set regexps is not yet mature. You might want to remember that if you use a GNU/Linux system which is not bang up-to-date (many are older than even glibc-2.3.3).

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595