Magnus Thor Torfason
2010-Sep-15 22:10 UTC
[R] Inconvenient behavior of as.data.frame() for lists without names
Hi all,

I ran into a small issue when converting a list of vectors to a data frame. The issue I'm having is described by the snippet below:

#########################################################
# Convert a list of vectors into a data.frame
strlen = 256
s.long.a = paste( letters[1+(0:strlen %% 26)], collapse="")
s.long.b = paste( letters[1+(strlen:0 %% 26)], collapse="")
v.long.a = rep(s.long.a, 2)
v.long.b = rep(s.long.b, 2)

# Convert when the list has no names for its elements
my.list = list(v.long.a, v.long.b)
my.df = as.data.frame(my.list)

# Here we get an error
my.df

# This solves the problem
names(my.list) = c("a","b")
my.fixed.df = as.data.frame(my.list)
my.fixed.df
#########################################################

In short, the problem is that when the elements of the list have no names, as.data.frame() creates very long names if the elements of the vectors themselves are long. Further, names that are in some sense disallowed (they can't be printed, for one) are silently injected into the data.frame, leading to an error later on.

Better would be to error out in as.data.frame.

Best would be if the way of generating default names in this function were intelligent enough to never create names longer than, say, 30 characters. Of course, explicit names should be honored.

Anyway, those are my thoughts on this issue. No patch attached, and I will work around this, but at least it is out there now.

Best,
Magnus Thor
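The workaround described above (naming the list before converting it) could be wrapped up roughly as follows. This is only a sketch: as_df_short_names() and its prefix argument are hypothetical names made up for illustration, not part of base R, and the "V1", "V2", ... scheme is an assumption rather than anything as.data.frame() does itself.

```r
# Hypothetical helper (not part of base R): give every unnamed list
# element a short default name, then convert to a data frame.
as_df_short_names <- function(x, prefix = "V") {
  nms <- names(x)
  if (is.null(nms)) nms <- rep("", length(x))
  unnamed <- is.na(nms) | nms == ""
  # Only fill in the gaps; explicit names are left alone.
  nms[unnamed] <- paste0(prefix, seq_along(x))[unnamed]
  names(x) <- nms
  as.data.frame(x, stringsAsFactors = FALSE)
}

# Same data as in the snippet above.
strlen <- 256
s.long.a <- paste(letters[1 + (0:strlen %% 26)], collapse = "")
s.long.b <- paste(letters[1 + (strlen:0 %% 26)], collapse = "")
my.list <- list(rep(s.long.a, 2), rep(s.long.b, 2))

my.df <- as_df_short_names(my.list)
names(my.df)

# A partially named list keeps its explicit names.
partly.named <- list(a = 1:2, 3:4)
names(as_df_short_names(partly.named))
```

Because the names are set before as.data.frame() ever sees the list, the long deparsed names are never generated in the first place.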
David Winsemius
2010-Sep-16 01:07 UTC
[R] Inconvenient behavior of as.data.frame() for lists without names
On Sep 15, 2010, at 6:10 PM, Magnus Thor Torfason wrote:

> Hi all,
>
> I ran into a small issue when converting a list of vectors to a data
> frame. The issue I'm having is described by the snippet below:
>
> #########################################################
> # Convert a list of vectors into a data.frame
> strlen = 256
> s.long.a = paste( letters[1+(0:strlen %% 26)], collapse="")
> s.long.b = paste( letters[1+(strlen:0 %% 26)], collapse="")
> v.long.a = rep(s.long.a, 2)
> v.long.b = rep(s.long.b, 2)
>
> # Convert when the list has no names for its elements
> my.list = list(v.long.a, v.long.b)
> my.df = as.data.frame(my.list)
>
> # Here we get an error
> my.df
>

I have also been annoyed at that behavior. I can make the problem go away by shortening the names assigned to nameless list elements, which happens about halfway through the code of the data.frame function:

 .
 .
    else if (no.vn[[i]]) {
        # the base data.frame fn does not use the substring shortening
        tmpname <- substr(deparse(object[[i]])[1L], 1, 10)
        if (substr(tmpname, 1L, 2L) == "I(") {
            ntmpn <- nchar(tmpname, "c")
            if (substr(tmpname, ntmpn, ntmpn) == ")")
                tmpname <- substr(tmpname, 3L, max(ntmpn, 20) - 1L)
        }
        vnames[[i]] <- tmpname
 .
 .

No error, and the names are c..abcdefg and c..wvutsrq. Whether you want to muck with the code of data.frame, well, it's your machine and if it breaks, the standard warranty applies ... you get to keep both pieces.

-- 
David.

> # This solves the problem
> names(my.list) = c("a","b")
> my.fixed.df = as.data.frame(my.list)
> my.fixed.df
> #########################################################
>
> In short, the problem is that when there are no names attached to
> the elements of the list, it creates very long names if the
> elements of the vectors themselves are long. And further, names
> that are in some sense disallowed (can't be printed, for one) are
> silently injected into a data.frame, leading to an error later on.
>
> Better would be to error out in as.data.frame
>
> Best would be if the way of generating default names in this function
> were intelligent enough to never create names longer than, say,
> 30 characters. Of course, explicit names should be honored.
>
> Anyway, those are my thoughts on this issue. No patch attached, and I
> will work around this, but at least it is out there now.
>
> Best,
> Magnus Thor
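The truncation idea from the reply can also be tried without touching data.frame itself, by building the shortened names on the list beforehand. The sketch below assumes that is an acceptable substitute. short_default_names() and its width argument are hypothetical names introduced here, and the function only imitates the deparse-and-truncate step quoted above rather than reproducing the base R code.

```r
# Hypothetical helper (not part of base R): name each unnamed element
# after the first `width` characters of its deparsed value, roughly as
# the modified data.frame code above would.
short_default_names <- function(x, width = 10L) {
  nms <- names(x)
  if (is.null(nms)) nms <- rep("", length(x))
  for (i in seq_along(x)) {
    if (is.na(nms[i]) || nms[i] == "") {
      nms[i] <- substr(deparse(x[[i]])[1L], 1L, width)
    }
  }
  # make.names() turns characters such as ( and " into dots and makes
  # the names unique, as data.frame would with check.names = TRUE.
  names(x) <- make.names(nms, unique = TRUE)
  x
}

strlen <- 256
s.long.a <- paste(letters[1 + (0:strlen %% 26)], collapse = "")
s.long.b <- paste(letters[1 + (strlen:0 %% 26)], collapse = "")
my.list <- list(rep(s.long.a, 2), rep(s.long.b, 2))

my.df <- as.data.frame(short_default_names(my.list))
names(my.df)
```

With width = 10 this should give the same two names reported in the reply, c..abcdefg and c..wvutsrq, and a larger width (say 30) would match the limit suggested in the original post.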