Dennis Fisher
2005-Sep-08 12:22 UTC
[R] Converting a matrix to a dataframe: how to prevent conversion to factor
Colleagues,

I am running R 2.1.0 on a Mac (the same problem occurs in Linux). In some situations, I have mixed text/numeric data that is stored as characters in a matrix. If I convert this matrix to a dataframe, the numeric data become factors, which is not what I intend.

TEXT <- paste("Text", 1:4, sep="")
NUMBERS <- 10 + 4:1
MATRIX <- cbind(TEXT, NUMBERS)
FRAME <- as.data.frame(MATRIX)

> str(FRAME)
`data.frame': 4 obs. of 2 variables:
 $ TEXT : Factor w/ 4 levels "Text1","Text2",..: 1 2 3 4
 $ NUMBERS: Factor w/ 4 levels "11","12","13",..: 4 3 2 1

One work-around is to write the matrix (or the dataframe) to a file, then read the file back using the as.is argument.

write.table(MATRIX, "JUNK", row.names=F)
NEWFRAME <- read.table("JUNK", as.is=T, header=T)

> str(NEWFRAME)
`data.frame': 4 obs. of 2 variables:
 $ TEXT : chr "Text1" "Text2" "Text3" "Text4"
 $ NUMBERS: int 14 13 12 11

This restores NUMBERS to its intended mode (integer, not factor). The text column is also no longer read as a factor (not a problem for me).

It appears that the function AsIs [I(x)] would enable me to accomplish this without the write/read steps. However, it is not obvious to me how to implement I(x). Can anyone advise?

Thanks in advance.

Dennis Fisher

Dennis Fisher MD
P < (The "P Less Than" Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-415-564-2220
www.PLessThan.com
Peter Dalgaard
2005-Sep-08 12:51 UTC
[R] Converting a matrix to a dataframe: how to prevent conversion to factor
Dennis Fisher <fisher at plessthan.com> writes:

> Colleagues
>
> I am running R 2.1.0 on a Mac (same problem occurs in Linux). In
> some situations, I have mixed text/numeric data that is stored as
> characters in a matrix. If I convert this matrix to a dataframe, the
> numeric data becomes factors, not what I intend.
>
> TEXT <- paste("Text", 1:4, sep="")
> NUMBERS <- 10 + 4:1
> MATRIX <- cbind(TEXT, NUMBERS)
> FRAME <- as.data.frame(MATRIX)
>
> > str(FRAME)
> `data.frame': 4 obs. of 2 variables:
>  $ TEXT : Factor w/ 4 levels "Text1","Text2",..: 1 2 3 4
>  $ NUMBERS: Factor w/ 4 levels "11","12","13",..: 4 3 2 1
>
> One work-around is to write the matrix (or the dataframe) to a file,
> then read the file back using the as.is argument.
> write.table(MATRIX, "JUNK", row.names=F)
> NEWFRAME <- read.table("JUNK", as.is=T, header=T)
>
> > str(NEWFRAME)
> `data.frame': 4 obs. of 2 variables:
>  $ TEXT : chr "Text1" "Text2" "Text3" "Text4"
>  $ NUMBERS: int 14 13 12 11
>
> This restores the NUMBERS to their intended mode (integers, not
> factors). The text column is also not read as a factor (not a
> problem for me).
>
> It appears that the function AsIs [I(x)] would enable me to
> accomplish this without the write/read steps. However, it is not
> obvious to me how to implement I(x). Can anyone advise?

I don't think that is going to help....

There are really several issues here: your numeric column was converted to character by the cbind; using as.data.frame(I(MATRIX)) will not split it into individual columns; and things like apply(MATRIX,2,f) may do the right thing to begin with, but then there's coercion due to an implicit cbind at the end.

It's a bit awkward, but this may do it:

> FRAME <- as.data.frame(lapply(split(MATRIX,col(MATRIX)),type.convert))
> names(FRAME) <- colnames(MATRIX)
> str(FRAME)
`data.frame': 4 obs. of 2 variables:
 $ TEXT : Factor w/ 4 levels "Text1","Text2",..: 1 2 3 4
 $ NUMBERS: int 14 13 12 11

whereas this isn't right:

> str(apply(MATRIX,2,type.convert))
 int [1:4, 1:2] 1 2 3 4 14 13 12 11
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:2] "TEXT" "NUMBERS"

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
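
The per-column conversion described in this reply can be sketched a little more compactly on a more recent R than the 2.1.0 used in the thread. This is only an illustration under stated assumptions: it relies on the stringsAsFactors argument of as.data.frame() and the as.is argument of type.convert(), and it is not claimed to run on R 2.1.0. The name WRAPPED is made up for the example. The first part shows the point about I(): wrapping the matrix keeps it as a single matrix-valued column instead of splitting it.

```r
# Rebuild the example from the thread; cbind() coerces both vectors to
# character because a matrix can hold only one type.
TEXT <- paste("Text", 1:4, sep = "")
NUMBERS <- 10 + 4:1
MATRIX <- cbind(TEXT, NUMBERS)

# I() does not help: the protected matrix becomes ONE column of the data
# frame rather than one column per matrix column.
# (WRAPPED is a hypothetical name used only for this illustration.)
WRAPPED <- as.data.frame(I(MATRIX))
ncol(WRAPPED)

# Assumption: a recent R where as.data.frame() accepts stringsAsFactors.
# Split the matrix into character columns first, without making factors.
FRAME <- as.data.frame(MATRIX, stringsAsFactors = FALSE)

# Convert each column on its own, so no cbind-style coercion happens
# afterwards. as.is = TRUE leaves non-numeric text as character.
FRAME[] <- lapply(FRAME, type.convert, as.is = TRUE)

str(FRAME)
```

Assigning into FRAME[] keeps the data frame and its column names, so the separate names(FRAME) <- colnames(MATRIX) step is not needed here. Dropping as.is = TRUE should give a factor for the text column, as in the output shown in the reply.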