Ravi S. Shankar
2008-Oct-20 07:26 UTC
[R] convert matrix to dataframe with repeating row names
Hi R,

I have a matrix x with repeating row names.

    > dim(x)
    [1] 862 19

    zz<-matrix(0,4,4)
    rownames(zz)=c("a","a","b","b")
    data.frame(zz) (?)

I need to use x in a linear regression:

    lm(as.formula(paste("final_dat[,5]~",paste(colnames(x),collapse="+"))),x)

This gives me an error:

    Error in model.frame.default(formula = as.formula(paste("final_dat[,5]~", :
      'data' must be a data.frame, not a matrix or an array

    > sessionInfo()
    R version 2.7.1 (2008-06-23)
    i386-pc-mingw32

    locale:
    LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] xlsReadWrite_1.3.2

Thanks in advance,
Ravi
Richard.Cotton at hsl.gov.uk
2008-Oct-20 09:29 UTC
[R] convert matrix to dataframe with repeating row names
> I have a matrix x with repeating row names.
>
> zz<-matrix(0,4,4)
> rownames(zz)=c("a","a","b","b")
> data.frame(zz) (?)

The row names on a data frame should be unique. You can try as.data.frame(zz, row.names=FALSE) to convert zz to a data frame. If you need the row name information, add it as a column in the data frame, e.g. mydataframe$rnames <- rownames(zz).

(Note to R-Core: the documentation for as.data.frame doesn't mention the usage of row.names=FALSE to ignore row names, but it seems to work consistently. Does the help page for as.data.frame need updating?)

> lm(as.formula(paste("final_dat[,5]~",paste(colnames(x),collapse="+"))),x)
>
> this gives me an error
>
> Error in model.frame.default(formula = as.formula(paste("final_dat[,5]~", :
>   'data' must be a data.frame, not a matrix or an array

I suspect that if you try class(x), it will be a matrix, not the requisite data frame.

Regards,
Richie.

Mathematical Sciences Unit
HSL
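For anyone following along, here is a minimal sketch of the advice above: drop the duplicated row names on a copy, convert, keep the labels as a column, and pass the resulting data frame to lm(). It deliberately avoids row.names=FALSE and removes the row names explicitly instead. The toy data, the column names (y, x1, x2, x3) and the variable names tmp, df, fml and fit are made up for illustration. Putting the response into the same data frame as the predictors is also an assumption; the original post takes it from final_dat[,5].

```r
# Toy matrix with repeated row names, mirroring zz from the question
# (values and column names are invented for this sketch).
set.seed(1)
zz <- matrix(rnorm(24), nrow = 6, ncol = 4,
             dimnames = list(c("a", "a", "b", "b", "c", "c"),
                             c("y", "x1", "x2", "x3")))

# Work on a copy so the row names of zz itself are left untouched.
tmp <- zz
rownames(tmp) <- NULL        # drop the duplicated row names
df <- as.data.frame(tmp)     # converts with default row names
df$rnames <- rownames(zz)    # keep the labels as an ordinary column

# Build the formula from the predictor column names and fit the model.
predictors <- c("x1", "x2", "x3")
fml <- as.formula(paste("y ~", paste(predictors, collapse = "+")))
fit <- lm(fml, data = df)
```

Because lm() now receives a data frame rather than a matrix, the "'data' must be a data.frame" error from the original post should not arise.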
Richard.Cotton at hsl.gov.uk
2008-Oct-20 10:47 UTC
[R] convert matrix to dataframe with repeating row names
> > The row names on a data frame should be unique. You can try
> > as.data.frame(zz, row.names=FALSE) to convert zz to be a data frame. If
> > you need the row name information, add it as a column in the data frame,
> > e.g. mydataframe$rnames <- rownames(zz). (Note to R-Core: the
> > documentation for as.data.frame doesn't mention the usage of
> > row.names=FALSE to ignore row names, but it seems to work consistently.
> > Does the help page for as.data.frame need updating?)
>
> No. row.names=FALSE is not intended to work, and did you check every
> single as.data.frame() method?
>
> It just so happens that for the matrix method invalid input for
> 'row.names' results in setting default row names. Other methods may
> differ.

row.names=FALSE seems a natural way of suppressing existing row names to me, since it corresponds nicely to using row.names=FALSE in write.csv.

Currently it seems that if a matrix has duplicate row names, then converting it to a data frame requires

    rnames <- rownames(mymatrix)
    rownames(mymatrix) <- NULL
    as.data.frame(mymatrix)
    rownames(mymatrix) <- rnames

Ideally, three of these lines of code shouldn't really need to be there.

If you disagree that allowing row.names=FALSE is a good idea, or you don't want to change the function interface, then perhaps having as.data.frame check for duplicates and throw a warning (rather than an error) would be preferable behaviour. I do realise that there are dozens of as.data.frame methods, and the documentation does state that "Few of the methods check for duplicated row names", but it would be beneficial from a user standpoint.

Regards,
Richie.

Mathematical Sciences Unit
HSL
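Until something like that exists in base R, the four-line workaround can be wrapped in a small user-level function. The sketch below is only an illustration of the "warn rather than fail" idea: matrix_to_df and its keep_names argument are hypothetical names, not part of base R or of any as.data.frame method. Since R modifies a copy of the argument inside the function, the caller's matrix keeps its row names and the final "restore" line is not needed.

```r
# Hypothetical helper (not part of base R): convert a matrix to a data frame,
# warning about duplicated row names instead of failing on them.
matrix_to_df <- function(m, keep_names = TRUE) {
  stopifnot(is.matrix(m))
  rn <- rownames(m)
  if (!is.null(rn) && anyDuplicated(rn) > 0) {
    warning("duplicated row names dropped; using default row names")
    rownames(m) <- NULL      # only the local copy is changed
  }
  out <- as.data.frame(m)
  # Optionally keep the original labels as an ordinary column.
  if (keep_names && !is.null(rn)) out$rnames <- rn
  out
}

# Usage with the matrix from the original question.
zz <- matrix(0, 4, 4)
rownames(zz) <- c("a", "a", "b", "b")
df <- suppressWarnings(matrix_to_df(zz))
```

The original labels end up in df$rnames, and zz itself is unchanged after the call.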