Arne.Muller@aventis.com
2003-Oct-18 10:33 UTC
[R] why does data frame subset return vector
Hello,

I've a weird problem with a data frame. Basically it should be just one column with specific names coming from a data file (the file contains 2 columns: one should be used for the row names of the data frame, the other contains numeric values).

> df.rr <- read.table("RR_anova.txt", header=T, comment.char="", row.names=1)
> df.rr[c(1,2,3),]
[1] 1.11e-16 1.11e-16 1.11e-16

Why are the row names not displayed? The data file itself looks like this:

> df.rr <- read.table("RR_anova.txt", header=T, comment.char="")
> df.rr[c(1,2,3),]
            QUAL   PVALUE
1    AJ224120_at 1.11e-16
2 rc_AA893000_at 1.11e-16
3 rc_AA946368_at 1.11e-16

and assigning the row names explicitly works as I'd expect:

> rownames(df.rr) <- df.rr$'QUAL'
> df.rr[c(1,2,3),]
                         QUAL   PVALUE
AJ224120_at       AJ224120_at 1.11e-16
rc_AA893000_at rc_AA893000_at 1.11e-16
rc_AA946368_at rc_AA946368_at 1.11e-16

Ok, now they are displayed, but it's a duplication to keep the "QUAL" column. Below I create a new data frame to skip the "QUAL" column, since it is already a row name.

> df.rr2 <- data.frame(PVALUE=df.rr, row.names=1)
> df.rr2[1:4,]
[1] 1.11e-16 1.11e-16 1.11e-16 1.11e-16

However, the row name is still there ..., you just cannot see it:

> df.rr2["AJ224120_at",]
[1] 1.11e-16

The code below shows that "sub-setting" the df.rr data frame indeed creates a vector rather than a data frame, whereas sub-setting the 2-column data frame returns a new data frame (as I'd expect).

> df.rr[1:4,]
[1] 1.11e-16 1.11e-16 1.11e-16 1.11e-16
> is.vector(df.rr[1:4,])
[1] TRUE
> is.data.frame(df.rr[1:4,])
[1] FALSE
> df.rr <- read.table("CLO_RR_anova.txt", header=T, comment.char="")
> is.data.frame(df.rr[1:4,])
[1] TRUE

Any explanation is appreciated. There must be a good reason for this, I guess ... . On the other hand, is there a way to force the subset of the 1-column data frame to be a data frame itself? I'd just like to see the row names displayed, that's it ...

thanks a lot for your help,

Arne
Do look at the discussion of the drop argument in ?"[.data.frame". It is all in TFM, and there is nothing `weird' about it.

On Sat, 18 Oct 2003 Arne.Muller at aventis.com wrote:

> I've a weird problem with a data frame. Basically it should be just one
> column with specific names coming from a data file (the file contains 2
> columns: one should be used for the row names of the data frame, the other
> contains numeric values).
>
> > df.rr <- read.table("RR_anova.txt", header=T, comment.char="", row.names=1)
> > df.rr[c(1,2,3),]
> [1] 1.11e-16 1.11e-16 1.11e-16
>
> Why are the row names not displayed?

Because data frames have row names (with a space) and vectors do not.

[...]

> Any explanation is appreciated. There must be a good reason for this, I
> guess ... . On the other hand, is there a way to force the subset of the
> 1-column data frame to be a data frame itself? I'd just like to see the
> row names displayed, that's it ...

The help page will tell you how.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
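
For readers of the archive, here is a minimal sketch of what the drop argument mentioned in the reply does. It does not read RR_anova.txt; the data frame is built inline as a stand-in, and the fourth row ("made_up_probe_at" and its value) is invented purely for illustration.

```r
# Hypothetical stand-in for the one-column data frame read with row.names=1.
# The first three probe IDs come from the post; the fourth row is made up.
df.rr <- data.frame(
  PVALUE = c(1.11e-16, 1.11e-16, 1.11e-16, 2.5e-03),
  row.names = c("AJ224120_at", "rc_AA893000_at",
                "rc_AA946368_at", "made_up_probe_at")
)

# Default behaviour: only one column is left, so the result is
# dropped to a plain vector, which has no row names to print.
v <- df.rr[1:3, ]
is.data.frame(v)   # FALSE

# drop = FALSE keeps the result as a data frame, so row names survive.
d <- df.rr[1:3, , drop = FALSE]
is.data.frame(d)   # TRUE
rownames(d)
print(d)
```

The same argument works when indexing by name, e.g. df.rr["AJ224120_at", , drop = FALSE].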