Rolf Turner
2019-Feb-05 22:22 UTC
[R] data.frame() versus as.data.frame() applied to a matrix.
Consider the following:

set.seed(42)
X <- matrix(runif(40),10,4)
colnames(X) <- c("a","b","a:x","b:x") # Imitating the output
                                      # of model.matrix().
D1 <- as.data.frame(X)
D2 <- data.frame(X)
names(D1)
[1] "a"   "b"   "a:x" "b:x"
names(D2)
[1] "a"   "b"   "a.x" "b.x"

The names of D2 are syntactically valid; those of D1 are not.

Why should I have expected this phenomenon? :-)

The as.data.frame() syntax seems to me much more natural for converting a matrix to a data frame, yet it doesn't get it quite right, sometimes, in respect of the names.

Is there some reason that as.data.frame() does not apply make.names()? Or was this just an oversight?

cheers,

Rolf Turner

--
Honorary Research Fellow
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276
Jeff Newmiller
2019-Feb-05 23:27 UTC
[R] data.frame() versus as.data.frame() applied to a matrix.
I have no idea about "why it is this way", but there are many cases where I would rather have to use backticks around syntactically-invalid names than deal with arbitrary rules for mapping column names as they were supplied to column names as R wants them to be. From that perspective, making the conversion function leave the names alone and limiting the name-mashing to one function sounds great to me. You can always call make.names() yourself.
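The two options mentioned in this reply (backticks, or calling make.names() by hand) can be sketched as follows. This is only an illustration on a small made-up matrix, not code from the thread; the variable names by_backtick and by_string are hypothetical.

```r
# Small illustrative matrix with the same kind of column names as in the
# original post (the values themselves are made up).
X <- matrix(1:8, nrow = 2)
colnames(X) <- c("a", "b", "a:x", "b:x")

# as.data.frame() keeps the column names exactly as supplied.
D1 <- as.data.frame(X)

# Option 1: leave the names alone and quote them when needed.
by_backtick <- D1$`a:x`     # backticks around the non-syntactic name
by_string   <- D1[["a:x"]]  # or index with a character string

# Option 2: convert the names explicitly, at a point of your choosing.
names(D1) <- make.names(names(D1), unique = TRUE)
names(D1)
# [1] "a"   "b"   "a.x" "b.x"
```

Doing the conversion explicitly keeps the renaming visible in the script rather than hidden inside the constructor call.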
Rolf Turner
2019-Feb-05 23:52 UTC
[R] data.frame() versus as.data.frame() applied to a matrix.
On 2/6/19 12:27 PM, Jeff Newmiller wrote:

> I have no idea about "why it is this way" but there are many cases
> where I would rather have to use backticks around
> syntactically-invalid names than deal with arbitrary rules for
> mapping column names as they were supplied to column names as R wants
> them to be. From that perspective, making the conversion function
> leave the names alone and limit the name-mashing to one function
> sounds great to me. You can always call make.names yourself.

Fair enough. My real problem was getting ambushed by the fact that *different* names arise depending on whether one uses data.frame(X) or as.data.frame(X). I'll spare you the details. :-)

cheers,

Rolf
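For readers hit by the same surprise, here is a minimal sketch of how the two calls can be made to agree. It assumes the difference comes from the check.names argument of data.frame(), which defaults to TRUE and passes the names through make.names(); nobody in the thread says so explicitly, so treat it as an assumption to check against ?data.frame. The name D2_raw is hypothetical.

```r
# Same setup as in the original post.
set.seed(42)
X <- matrix(runif(40), 10, 4)
colnames(X) <- c("a", "b", "a:x", "b:x")

D1     <- as.data.frame(X)                   # names kept as supplied
D2     <- data.frame(X)                      # check.names = TRUE by default
D2_raw <- data.frame(X, check.names = FALSE) # ask data.frame() not to rename

# With check.names = FALSE the two constructors give the same names.
identical(names(D1), names(D2_raw))
# [1] TRUE

# The default data.frame() names match an explicit make.names() call.
identical(names(D2), make.names(colnames(X), unique = TRUE))
# [1] TRUE
```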
William Dunlap
2019-Feb-06 01:41 UTC
[R] data.frame() versus as.data.frame() applied to a matrix.
I think of the methods of as.data.frame as helper functions for data.frame and don't usually call as.data.frame directly. data.frame() will call as.data.frame for each of its arguments and then put together the results into one big data.frame.

> for(method in c("as.data.frame.list","as.data.frame.character","as.data.frame.integer","as.data.frame.numeric","as.data.frame.matrix")) trace(method, quote(str(x)))
Tracing function "as.data.frame.list" in package "base"
Tracing function "as.data.frame.character" in package "base"
Tracing function "as.data.frame.integer" in package "base"
Tracing function "as.data.frame.numeric" in package "base"
Tracing function "as.data.frame.matrix" in package "base"
> d <- data.frame(Mat=cbind(m1=11:12,M2=13:14),Num=c(15.5,16.6),Int=17:18,List=list(L1=19:20,L2=c(20.2,21.2)))
Tracing as.data.frame.matrix(x[[i]], optional = TRUE) on entry
 int [1:2, 1:2] 11 12 13 14
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:2] "m1" "M2"
Tracing as.data.frame.numeric(x[[i]], optional = TRUE) on entry
 num [1:2] 15.5 16.6
Tracing as.data.frame.integer(x[[i]], optional = TRUE) on entry
 int [1:2] 17 18
Tracing as.data.frame.list(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors) on entry
List of 2
 $ L1: int [1:2] 19 20
 $ L2: num [1:2] 20.2 21.2
Tracing as.data.frame.integer(x[[i]], optional = TRUE) on entry
 int [1:2] 19 20
Tracing as.data.frame.numeric(x[[i]], optional = TRUE) on entry
 num [1:2] 20.2 21.2

If I recall correctly, that is how S did things, and Splus tried to use something like as.data.frameAux for the name of the helper function to avoid some of the frustration you describe.

Bill Dunlap
TIBCO Software
wdunlap tibco.com