Martin Morgan
2006-Sep-27 13:13 UTC
[Rd] colnames is slow for data.frames with implicit row.names
colnames on a data.frame with implicit row.names

> df <- data.frame(x=1:6000000)

is slow

> system.time(colnames(df))
[1] 21.655  0.327 21.987  0.000  0.000
> system.time(names(df))
[1] 0 0 0 0 0

because colnames calls dimnames calls row.names.data.frame calls
as.character on the implicit row.names.

-- 
Martin T. Morgan
Bioconductor / Computational Biology
http://bioconductor.org
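The call chain described above can be seen on a much smaller data frame. The following is only a sketch: the size of 10 rows and the variable names (compact, dn, row_part, col_part) are chosen for illustration, and it shows the conversion of implicit row names to character rather than reproducing the timings reported in the message.

```r
# Small data frame with implicit (automatic) row names; the size is arbitrary.
df <- data.frame(x = 1:10)

# A negative count signals automatic row names held in compact form.
compact <- .row_names_info(df)

# dimnames() returns both components, so the row names are expanded
# into a character vector even when only the column names are wanted.
dn <- dimnames(df)
row_part <- dn[[1]]
col_part <- dn[[2]]

is.character(row_part)           # row names materialised as strings
identical(col_part, names(df))   # the column part is just names(df)
```

With millions of rows, that character conversion is the step the message identifies as the cost.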
Prof Brian Ripley
2006-Sep-27 13:30 UTC
[Rd] colnames is slow for data.frames with implicit row.names
On Wed, 27 Sep 2006, Martin Morgan wrote:

> colnames on a data.frame with implicit row.names
>
>> df <- data.frame(x=1:6000000)
>
> is slow
>
>> system.time(colnames(df))
> [1] 21.655 0.327 21.987 0.000 0.000
>> system.time(names(df))
> [1] 0 0 0 0 0
>
> because colnames calls dimnames calls row.names.data.frame calls
> as.character on the implicit row.names.

So use names() and not colnames():

rownames and colnames for matrices
row.names and names for data frames.

All colnames assumes is that there is a dimnames method: this could be
relevant for objects inheriting from "data.frame", but there is a price
for generality.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
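One way to follow the advice above in code that may receive either kind of object is a small dispatching wrapper. This is a minimal sketch, not part of base R: column_names is a hypothetical helper introduced here for illustration, and the example data are made up.

```r
# Hypothetical helper: use the cheap accessor for data frames and fall
# back to colnames() for matrices and other objects with dimnames.
column_names <- function(x) {
  if (is.data.frame(x)) names(x) else colnames(x)
}

df <- data.frame(x = 1:10, y = letters[1:10])
m  <- matrix(1:4, nrow = 2, dimnames = list(NULL, c("a", "b")))

column_names(df)   # goes through names(), never touches row names
column_names(m)    # goes through colnames(), i.e. dimnames(m)[[2]]
```

Note that such a wrapper gives up the generality mentioned above: a class inheriting from "data.frame" with its own dimnames method would be bypassed by the names() branch.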