Jason Vertrees
2009-May-29 21:33 UTC
[Rd] Why change data type when dropping to one-dimension?
Hello,

First, let me say I'm an avid fan of R--it's incredibly powerful and I use it all the time. I appreciate all the hard work that the many developers have put in.

My question is: why does the paradigm of changing the type of a 1D return value to an unlisted array exist? This introduces boundary conditions where none need exist, thus making the coding harder and confusing.

For example, consider:
> d = data.frame(a=rnorm(10), b=rnorm(10));
> typeof(d); # OK;
> typeof(d[,1]); # Unexpected;
> typeof(d[,1,drop=F]); # Oh, now I see.

This is indeed documented in the R Language specification, but why is it there in the first place? It doesn't make sense to the average programmer to change the return type based on dimension.

Here it is again in 'sapply':
> sapply
> function (X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE)
> {
>     [...snip...]
>     if (common.len == 1)
>         unlist(answer, recursive = FALSE)
>     else if (common.len > 1)
>         array(unlist(answer, recursive = FALSE),
>             dim = c(common.len, length(X)),
>             dimnames = if (!(is.null(n1 <- names(answer[[1]])) &
>                 is.null(n2 <- names(answer))))
>                 list(n1, n2))
>     [...snip...]
> }

So, in 'sapply', if your return value is one-dimensional be careful, because the return type will not be the same as if it were otherwise.

Is this legacy or a valid, rational design decision which I'm not yet a sophisticated enough R coder to enjoy?

Thanks,

-- Jason

--
Jason Vertrees, PhD
Dartmouth College : jv at cs.dartmouth.edu
Boston University : jasonv at bu.edu
PyMOLWiki : http://www.pymolwiki.org/
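For readers who want to reproduce the behaviour described in this post, here is a minimal sketch. The seed, the variable names `one`, `many` and `fixed`, and the use of `vapply` as a shape-stable alternative are illustrative additions, not part of the original message.

```r
set.seed(1)  # illustrative seed, not in the original post
d <- data.frame(a = rnorm(10), b = rnorm(10))

typeof(d)                     # "list": a data frame is a list of columns
typeof(d[, 1])                # "double": the single column is dropped to a vector
typeof(d[, 1, drop = FALSE])  # "list": still a one-column data frame

# sapply() simplifies according to the common length of the results.
one  <- sapply(1:3, function(i) i^2)        # length-1 results: plain vector
many <- sapply(1:3, function(i) c(i, i^2))  # length-2 results: 2 x 3 matrix

is.null(dim(one))  # TRUE
dim(many)          # 2 3

# vapply() takes a template for each result, so the output shape is
# fixed by the caller rather than inferred from the data.
fixed <- vapply(1:3, function(i) c(i, i^2), numeric(2))
dim(fixed)         # 2 3
```

When the result shape must not depend on the data, `sapply(..., simplify = FALSE)` (which always returns a list) or `vapply` are the usual ways to avoid the boundary case.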
Thomas Lumley
2009-May-29 21:54 UTC
[Rd] Why change data type when dropping to one-dimension?
On Fri, 29 May 2009, Jason Vertrees wrote:

> My question is: why does the paradigm of changing the type of a 1D
> return value to an unlisted array exist? This introduces boundary
> conditions where none need exist, thus making the coding harder and
> confusing.
>
> For example, consider:
> > d = data.frame(a=rnorm(10), b=rnorm(10));
> > typeof(d); # OK;
> > typeof(d[,1]); # Unexpected;
> > typeof(d[,1,drop=F]); # Oh, now I see.

It does make it harder for programmers, but it makes it easier for non-programmers. In particular, it is convenient to be able to do d[1,1] to extract a number from a matrix, rather than having to explicitly coerce the result to stop it being a matrix.

At least the last two times this was discussed, there ended up being a reasonable level of agreement that if someone's life had to be made harder the programmers were better able to cope and that dropping dimensions was preferable.

-thomas

Thomas Lumley
Assoc. Professor, Biostatistics
tlumley at u.washington.edu
University of Washington, Seattle
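The trade-off described above can be sketched with a small matrix. The matrix `m` and the helper `safe_rows` are hypothetical names introduced only for illustration; they do not come from the thread.

```r
m <- matrix(1:6, nrow = 2)   # hypothetical 2 x 3 example matrix

x  <- m[1, 1]                # convenient: a bare number, dimensions dropped
y  <- m[1, 1, drop = FALSE]  # explicit: a 1 x 1 matrix
r  <- m[1, ]                 # a single row becomes a plain vector
r2 <- m[1, , drop = FALSE]   # a single row kept as a 1 x 3 matrix

is.null(dim(x))  # TRUE
dim(y)           # 1 1
nrow(r)          # NULL, which is what trips up code expecting a matrix
dim(r2)          # 1 3

# Hypothetical helper: row selection that always returns a matrix,
# even when idx happens to have length one.
safe_rows <- function(mat, idx) mat[idx, , drop = FALSE]
nrow(safe_rows(m, 1))  # 1
```

Interactive use benefits from the default; code that selects rows or columns with a computed index is where passing `drop = FALSE` explicitly tends to matter.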
Stavros Macrakis
2009-May-29 22:17 UTC
[Rd] Why change data type when dropping to one-dimension?
This is another example of the general preference of the designers of R for convenience over consistency.

In my opinion, this is a design flaw even for non-programmers, because I find that inconsistencies make the system harder to learn. Yes, the naive user may stumble over the difference between m[[1,1]] and m[1,1] a few times before getting it, but once he or she understands the principle, it is general.

-s
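One way to read the remark about m[[1,1]] and m[1,1] is as a reference to the general rule that `[` keeps the container while `[[` extracts an element. That reading is an assumption; the sketch below only shows how the two operators behave in R, using hypothetical objects `l`, `d` and `m`.

```r
l <- list(a = 1:3, b = "x")  # hypothetical example list
typeof(l["a"])               # "list": `[` returns a sub-list
typeof(l[["a"]])             # "integer": `[[` returns the element itself

d <- data.frame(a = 1:3, b = 4:6)
class(d["a"])                # "data.frame": one-column data frame
class(d[["a"]])              # "integer": the column vector

m <- matrix(1:4, nrow = 2)
# For an atomic matrix the default drop = TRUE makes both forms
# return the bare element, so the distinction disappears here.
identical(m[1, 1], m[[1, 1]])       # TRUE
dim(m[1, 1, drop = FALSE])          # 1 1
```

For lists and data frames the `[` versus `[[` distinction holds without exception, which is the kind of consistency the post argues for; for matrices the default drop behaviour hides it.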
Wacek Kusnierczyk
2009-May-29 22:25 UTC
[Rd] Why change data type when dropping to one-dimension?
Stavros Macrakis wrote:

> This is another example of the general preference of the designers of R for
> convenience over consistency.
>
> In my opinion, this is a design flaw even for non-programmers, because I
> find that inconsistencies make the system harder to learn. Yes, the naive
> user may stumble over the difference between m[[1,1]] and m[1,1] a few times
> before getting it, but once he or she understands the principle, it is
> general.

+1

vQ