The help text for row+colnames {base} states:

  "For a data frame, rownames and colnames eventually call row.names
  and names respectively, but the latter are preferred."

Why are they "preferred"?
Why is it names(), not col.names()?
I have only ever used names() for vectors - I'm surprised it works on data.frames... IMO this is not great for code readability, thus thinking to require rownames(), colnames() for all 2D objects, names() for vectors and lists. Any problems with this approach?

Thanks for some insight!
Boris
Jeff Newmiller
2016-Apr-03 01:11 UTC
[R] row.names(), rownames(), colnames(), names() ...?
Data frames are lists of columns. The names() function is appropriate for lists.

It doesn't pay to fall into the trap of thinking that data frames are truly symmetric between columns and rows, because accessing rows carries a greater performance cost than accessing columns.

With that in mind, thinking of data frames as lists is preferred, so names() is preferred over colnames().

--
Sent from my phone. Please excuse my brevity.

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
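
The point that a data frame is a list of columns can be checked directly at the R prompt. The sketch below uses a small made-up data frame and matrix (the names df, m, id and score are illustrative assumptions, not from the thread). It also shows one likely snag with the "colnames() for all 2D objects" rule from the original question: the rule is safe, but names() behaves differently on the two kinds of 2D object, because a matrix is not a list.

```r
# A small data frame and a matrix with the same column labels (made-up data).
df <- data.frame(id = 1:3, score = c(2.5, 3.0, 4.5))
m  <- matrix(1:6, nrow = 3, dimnames = list(NULL, c("id", "score")))

# A data frame is a list whose elements are the columns,
# so names() and colnames() report the same labels.
is.list(df)
identical(names(df), colnames(df))
identical(row.names(df), rownames(df))

# List-style access works because each column is a list element.
df[["score"]]
sapply(df, class)

# A matrix is not a list: it has dimnames but no "names" attribute,
# so names() returns NULL while colnames() returns the labels.
is.list(m)
names(m)
colnames(m)
```

So colnames() works on both objects, while names() is only meaningful for the data frame.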
Ah, that makes immediate sense.

On Apr 2, 2016, at 9:11 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:

> Data frames are lists of columns. The names() function is appropriate for lists.
>
> It doesn't pay to fall into the trap of thinking that data frames are truly symmetric between columns and rows, because there is a performance penalty for accessing rows that is greater than the cost of accessing columns.

Interesting, I didn't know that.

> With that in mind, thinking of data frames as lists is preferred, so names is preferred over colnames.

I see. Thinking about data frames like that has the added benefit that this matches how we describe entities in relational data models. Both then turn out to be the transpose of the typical spreadsheet.

Thanks Jeff
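
The row-versus-column asymmetry mentioned above can be seen in what each kind of access returns. This is a rough sketch on made-up data (df, big and the column names are assumptions for illustration). The timing lines are there only to let a reader try it; the absolute numbers depend on the machine and R version, so no expected values are given.

```r
df <- data.frame(id = 1:3, score = c(2.5, 3.0, 4.5))

# Column access pulls out one existing list element: an atomic vector.
col <- df[["score"]]
is.numeric(col)

# Row access has to take one value from every column and
# assemble them into a new one-row data frame.
row <- df[2, ]
is.data.frame(row)
sapply(row, class)

# Rough timing on a larger made-up data frame.
big <- data.frame(a = runif(1e4),
                  b = runif(1e4),
                  c = sample(letters, 1e4, replace = TRUE))
col_time <- system.time(for (i in 1:1000) big[["a"]])
row_time <- system.time(for (i in 1:1000) big[i, ])
print(col_time)
print(row_time)
```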