Satoshi Takahama
2006-May-16 00:57 UTC
[R] retaining character matrices when combining into data frames
Hello everyone,

If I want to convert or combine a (large) character matrix into a data frame without having any of its columns convert into a factor class, is there a simple solution? I() says it will operate on 'an object' but it seems that unless the object is a vector, the results are not what I expect.

For instance, if g is a 2x2 character matrix, as.data.frame(I(g)) will return an object of the data frame class but not structured in the way I intended. What I hope to retrieve is the result of

    h = data.frame(I(g[,1]),I(g[,2]))
    names(h) = dimnames(g)[[2]]

With data sets of 100+ columns, wrapping I() around each column can be very time-consuming when invoking the functions data.frame(), cbind(), or as.data.frame(). R used to offer the optional argument, as.is=TRUE (and S-PLUS offered stringsAsFactors=FALSE), to accomplish this task but it seems that this argument was removed some time ago. Is there a more attractive alternative available now?

Thanks very much,

Satoshi

__________
Satoshi Takahama
Scripps Institution of Oceanography
Center for Atmospheric Science
9500 Gilman Drive, Dept. 0221
La Jolla, CA 92093
858-531-5328
Gabor Grothendieck
2006-May-16 01:59 UTC
[R] retaining character matrices when combining into data frames
Try this:

    # test data
    mat <- matrix(letters, 2)

    # convert to data frame with character columns
    DF <- replace(as.data.frame(mat),,mat)
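A minimal sketch of how one might check the result of the replace() suggestion above. The test matrix is the same illustrative one; the class check at the end is an addition for verification, not part of the original reply.

```r
# Illustrative test matrix: 2 rows x 13 columns, all character
mat <- matrix(letters, 2)

# as.data.frame() builds the frame; replace() with an empty index
# then overwrites every column with the original character values
DF <- replace(as.data.frame(mat), , mat)

# Verification step (added here): report the class of each column
print(sapply(DF, class))
```

If every column reports "character", the conversion did what was asked, whatever the factor-conversion default of the R version in use.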
Prof Brian Ripley
2006-May-16 06:35 UTC
[R] retaining character matrices when combining into data frames
On Mon, 15 May 2006, Satoshi Takahama wrote:

> Hello everyone,
>
> If I want to convert or combine a (large) character matrix into a data
> frame without having any of its columns convert into a factor class,
> is there a simple solution? I() says it will operate on 'an
> object' but it seems that unless the object is a vector, the results
> are not what I expect.
>
> For instance, if g is a 2x2 character matrix, as.data.frame(I(g)) will
> return an object of the data frame class but not structured in the way
> I intended.

Which is? Your subject line says you want to retain a character matrix, and that is exactly what happens:

    > g <- matrix(letters[1:4], 2, 2)
    > as.data.frame(I(g))
      x.1 x.2
    1   a   c
    2   b   d

> What I hope to retrieve is the result of
>
> h = data.frame(I(g[,1]),I(g[,2]))
> names(h) = dimnames(g)[[2]]

So it seems you don't actually want to `retain character matrices'.

> With data sets of 100+ columns, wrapping I() around each column can be
> very time-consuming when invoking the functions data.frame(), cbind(),
> or as.data.frame().

Really? How are you trying to do it?

> R used to offer the optional argument, as.is=TRUE
> (and S-PLUS offered stringsAsFactors=FALSE), to accomplish this
> task but it seems that this argument was removed some time ago.

It was not there in R 1.0.0, so you are talking about alpha/beta versions. I think it existed only when data.frame was a rather different class.

> Is there a more attractive alternative available now?

I guess what you want is for each column of a character matrix to be inserted as a character vector. There are several ways: a simple one is

    DF <- data.frame(g)
    for(i in 1:ncol(g)) DF[[i]] <- g[, i]

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  stats.ox.ac.uk/~ripley
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
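A sketch of the column-by-column approach above, applied to a small matrix with column names so the result matches what the original question asked for. The matrix contents and the column names "first" and "second" are hypothetical. The second variant assumes an R version in which as.data.frame() accepts a stringsAsFactors argument, which was not the case for the versions discussed in this thread.

```r
# Hypothetical 2 x 2 character matrix; the column names are illustrative only
g <- matrix(letters[1:4], 2, 2, dimnames = list(NULL, c("first", "second")))

# Build the frame, then overwrite each column with the matching matrix column
DF <- data.frame(g)
for (i in seq_len(ncol(g))) DF[[i]] <- g[, i]

# Assumption: an R version whose as.data.frame() has stringsAsFactors
DF2 <- as.data.frame(g, stringsAsFactors = FALSE)

# Both frames should carry character columns named after the matrix columns
print(sapply(DF, class))
print(sapply(DF2, class))
```

The loop uses seq_len(ncol(g)) rather than 1:ncol(g) so that a matrix with zero columns is skipped cleanly; otherwise it follows the posted code.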