(Using R1.5.0 patched for Windows)

Subscripting rows of a data.frame only sometimes does what the documentation says:

> When `[.data.frame' is used for subsetting rows of a `data.frame',
> it returns a dataframe with unique row names, using `make.names( *
> , unique = TRUE)', see the `swiss' example below.

Row names seem to be modified in this way only when row indices pick out non-existing or non-unique rows.

Splus6.0 for Windows has similar behavior, though the details are different (in Splus, modification of row names is triggered only by non-unique indices, including zeros). However, at least the R documentation does say *something* about row names being modified by make.names.

[Personally, I'd rather have a version of the dataframe class whose methods didn't go around modifying the row and column names I had assigned...]

Note that the documentation could be considered correct if it is assumed to apply only in cases where the input data frame has row names that are syntactically valid variable names (i.e., will pass through make.names() untouched). However, I couldn't find anything in the R or S-plus online documentation that said this is a requirement for data frames.
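As a small illustration of what "pass through make.names() untouched" means, the sketch below applies make.names() to the row names used in the examples in this message. The variable names (rn, fixed, unchanged, dedup) are made up for illustration. The exact suffix that unique = TRUE appends to a duplicate may differ between R versions, so the sketch only checks that the result is free of duplicates.

```r
# Row names taken from the examples in this message
rn <- c("3AB", "C/D", "E.F")

# make.names() rewrites anything that is not a syntactically valid name:
# a leading digit gets an "X" prefix, and "/" is replaced by "."
fixed <- make.names(rn)
print(fixed)

# Only names that are already syntactically valid survive unchanged
unchanged <- rn == fixed
print(unchanged)

# With unique = TRUE, duplicated names are additionally disambiguated
dedup <- make.names(c("AB", "AB", "C/D"), unique = TRUE)
print(anyDuplicated(dedup) == 0)
```

Of the three names, only "E.F" is left alone, which matches the row names seen in the transcript whenever they are modified.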
> data.frame(x=1:3,y=4:6,row.names=c("3AB","C/D","E.F"))[c(1,2,3),]
    x y
3AB 1 4
C/D 2 5
E.F 3 6
> data.frame(x=1:3,y=4:6,row.names=c("3AB","C/D","E.F"))[c(1,2,3,NA),]
      x  y
X3AB  1  4
C.D   2  5
E.F   3  6
NA   NA NA
> data.frame(x=1:3,y=4:6,row.names=c("AB","C/D","E.F"))[c(1,2,3,0),]
    x y
AB  1 4
C/D 2 5
E.F 3 6
> data.frame(x=1:3,y=4:6,row.names=c("AB","C/D","E.F"))[c(1,2,3,4),]
     x  y
AB   1  4
C.D  2  5
E.F  3  6
NA  NA NA
> data.frame(x=1:3,y=4:6,row.names=c("AB","C/D","E.F"))[c(1,1,2,3),]
    x y
AB  1 4
AB1 1 4
C.D 2 5
E.F 3 6
> data.frame(x=1:3,y=4:6,row.names=c("AB","C/D","E.F"))[c("AB","C/D","E.F"),]
    x y
AB  1 4
C/D 2 5
E.F 3 6
> data.frame(x=1:3,y=4:6,row.names=c("AB","C/D","E.F"))[c("AB","C/D","E.F","NA"),]
     x  y
AB   1  4
C.D  2  5
E.F  3  6
NA  NA NA
> data.frame(x=1:3,y=4:6,row.names=c("AB","C/D","E.F"))[c("AB","C/D","E.F",NA),]
     x  y
AB   1  4
C.D  2  5
E.F  3  6
NA  NA NA
> version
         _
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status   Patched
major    1
minor    5.0
year     2002
month    05
day      16
language R

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
ripley@stats.ox.ac.uk
2002-Jun-05 07:23 UTC
[R] documentation inconsistency for [.data.frame ?
On Tue, 4 Jun 2002, Tony Plate wrote:

> (Using R1.5.0 patched for Windows)
> Subscripting rows of data.frame only sometimes does what the documentation
> says:
>
> > When `[.data.frame' is used for subsetting rows of a `data.frame',
> > it returns a dataframe with unique row names, using `make.names( *
> > , unique = TRUE)', see the `swiss' example below.
>
> Row names seem to be modified in this way only when row indices pick out
> non-existing or non-unique rows.

Right, and that's exactly what it says. The rownames in a data frame started out as unique and non-missing, and those are the ways that you can break that. I think you are misreading the sentence.

> Splus6.0 for Windows has similar behavior, though the details are different
> (modification of row names is triggered only by non-unique indices,
> including zeros, in Splus). However, at least the R documentation does say
> *something* about row names being modified by make.names.

Well, that system does not have missing character strings, so the rule is slightly different.

> [Personally, I'd rather have a version of the dataframe class whose methods
> didn't go around modifying the row and column names I had assigned...]

Impossible. Unique row names are part of the class definition. If you insist on using operations that violate that, you should expect to be corrected.

> Note that the documentation could be considered correct if it is assumed
> that the documentation only applies in the cases where the input data frame
> has row names that are syntactically valid variable names (i.e., will pass
> through make.names() untouched). However, I couldn't find anything in R or
> S-plus online documentation that said this is a requirement for data frames.

It is a requirement that they be unique. It is not a requirement that they be syntactically valid.
The latter is just how R corrects your mistakes.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
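The distinction drawn in this reply (row names must be unique, but need not be syntactically valid) can be checked with a short sketch. This assumes a reasonably recent R; the transcript in this thread is from R 1.5.0, and the exact form of the adjusted row names has changed across versions, so the sketch only tests for uniqueness. The variable names (df, dup_result, rep_rows, oob_rows) are made up for illustration.

```r
# Non-syntactic but unique row names are accepted exactly as given
df <- data.frame(x = 1:3, y = 4:6, row.names = c("3AB", "C/D", "E.F"))
print(rownames(df))

# Duplicate row names violate the class definition and are rejected
dup_result <- tryCatch(
  data.frame(x = 1:2, row.names = c("AB", "AB")),
  error = function(e) "error"
)
print(dup_result)

# Repeated or non-existing row indices would break uniqueness, so the
# subsetting method adjusts the row names of the result to keep them unique
rep_rows <- df[c(1, 1, 2, 3), ]
oob_rows <- df[c(1, 2, 3, 4), ]
print(anyDuplicated(rownames(rep_rows)) == 0)
print(anyDuplicated(rownames(oob_rows)) == 0)
```

Plain subsetting such as df[c(1, 2, 3), ] cannot introduce a duplicate or missing row name, which is why the names are left alone in that case.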