There have been comments from time to time (over many years) on the inefficiency of the storage of character vectors in R, and R-core has been looking into the issues. We have some ideas but they would be a considerable amount of work to implement and it is unclear if they would actually help with current real-world problems.

One example was the storage of integer row names for data frames, but such row names are stored much more efficiently in R-devel (2.4.0-to-be). We do have some other examples but these are highly artificial.

What we would like is some real-world examples of problems in which users have found the storage of character vectors to be an appreciable problem. Ideally we want concrete reproducible examples that show the problem in R-devel, but abstractions of such examples (for example using synthetic rather than real data) would also be very helpful.

If you can help, please do so by replying to this thread (and making examples available via URLs would probably be the most efficient route).

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595