Subsetting from a dataframe with only one variable
returns a vector, not a dataframe.
This seems somewhat inconsistent.
Wouldn't it be better if subsetting would respect
the structure completely?

v1<-1:4
v2<-4:1
df1<-data.frame(v1)
df2<-data.frame(v1,v2)
sel1<-c(TRUE,TRUE,TRUE,TRUE)

> df1[sel1,]
[1] 1 2 3 4
> df2[sel1,]
  v1 v2
1  1  4
2  2  3
3  3  2
4  4  1

-- 
Erich Neuwirth
Institute for Scientific Computing and
Didactic Center for Computer Science
University of Vienna
phone: +43-1-4277-39464 fax: +43-1-4277-39459
> Subsetting from a dataframe with only one variable
> returns a vector, not a dataframe.
> This seems somewhat inconsistent.
> Wouldn't it be better if subsetting would respect
> the structure completely?
>
> [...]
>
> > df1[sel1,]
> [1] 1 2 3 4

df1[sel1, , drop=FALSE]

should do what you want.

Best,
Matthias
On Wed, 11 Jan 2006, Erich Neuwirth wrote:

> Subsetting from a dataframe with only one variable
> returns a vector, not a dataframe.
> This seems somewhat inconsistent.

Not at all. It is entirely consistent with matrix-like indexing (the form
you used).

> Wouldn't it be better if subsetting would respect
> the structure completely?

It depends how you do it. [sel1,] parallels a matrix, and drops
dimensions unless drop = FALSE is supplied. List-like indexing with a
single index, as in df1[1], returns a one-column df, and df1[[1]] returns
a vector. It is just a question of choosing the appropriate tool.

And any changes to this sort of thing (from the White book) would break a
lot of careful code.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
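The indexing forms distinguished above can be checked directly in a session. The following is only a minimal sketch that reuses df1 and sel1 from the original post; the result names (by_matrix, kept, by_list, by_element) are made up for illustration.

```r
v1 <- 1:4
df1 <- data.frame(v1)
sel1 <- c(TRUE, TRUE, TRUE, TRUE)

# Matrix-style indexing on a one-column data frame drops to a vector.
by_matrix <- df1[sel1, ]
is.data.frame(by_matrix)   # FALSE

# Same indexing with drop = FALSE keeps the data frame structure.
kept <- df1[sel1, , drop = FALSE]
is.data.frame(kept)        # TRUE
dim(kept)                  # 4 1

# List-style indexing with a single index keeps a one-column data frame.
by_list <- df1[1]
is.data.frame(by_list)     # TRUE

# Double brackets extract the column itself as a vector.
by_element <- df1[[1]]
is.data.frame(by_element)  # FALSE
```

Which form is appropriate depends on whether the calling code expects a data frame or a plain vector afterwards.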
> df1
  v1
1  1
2  2
3  3
4  4
> df1[,]
[1] 1 2 3 4
> df1[,1]
[1] 1 2 3 4
> df1[,,drop=F]
  v1
1  1
2  2
3  3
4  4
> df1[,1,drop=F]
  v1
1  1
2  2
3  3
4  4
> df1[1]
  v1
1  1
2  2
3  3
4  4
> df1[[1]]
[1] 1 2 3 4

For transfers from Excel to R using the "[put/get] R dataframe" commands,
I think it is important always to use the drop=FALSE argument (as I assume
you are doing in RExcel V1.55). The reason for this is to maintain a rigid
relationship between the only partially compatible conventions of Excel
and R.

For strictly within R use, the case is less clear. I have trained myself
always (well 85% on the first try) to use the drop=FALSE argument when I
care about the structure after the copy.

The tension between keeping the structure and demoting the structure
predates data.frames. This was a design issue in matrices as well.

> tmp <- matrix(1:6,2,3)
> tmp
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
> tmp[1,]
[1] 1 3 5
> tmp[1,,drop=FALSE]
     [,1] [,2] [,3]
[1,]    1    3    5

Rich
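To illustrate why the habit of writing drop=FALSE matters when later code depends on the structure, here is a small sketch. The helper keep_rows and the example data are hypothetical and not part of RExcel or of anything posted in this thread.

```r
# Hypothetical helper: keep the rows of a data frame where a logical
# mask is TRUE, always returning a data frame.
keep_rows <- function(df, mask) {
  df[mask, , drop = FALSE]
}

one_col <- data.frame(v1 = 1:4)
mask <- one_col$v1 > 2

res <- keep_rows(one_col, mask)
nrow(res)      # 2
names(res)     # "v1"

# Without drop = FALSE the one-column result is a plain vector, so code
# that asks for the number of rows gets NULL instead of 2.
dropped <- one_col[mask, ]
nrow(dropped)  # NULL
```

Wrapping the subsetting in one place like this is only one possible design; writing drop=FALSE at each call site works just as well.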