## Thomas rightly points out that list() is not the best structure for
## homogeneous data.  My example was the simplest that generated the
## error of a matrix structure that doesn't work.  The application
## that this is simplified from needs lists because the data isn't
## homogeneous.  I am attempting to write a missing value class, where
## each item is a list.  In the simplest instance, if the datum is missing
## then the attribute contains the reason.
##
## Both of the 'bugs'/'unimplemented features'
##    3. a <- matrix(list(1,2,3,4,5,6), 2, 3)
##    6. bug in "[" for lists
## show up in this example.  Thus

x <- list(1,2,3,4,5,6)
dim(x) <- c(2,3)
x[[2,3]] <- NA
attr(x[[2,3]],"mv") <- "absent"
x
for(i in x) print(i)
x[2,]
x[1,1]
x[2,3]

Rich
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
On Mon, Apr 23, 2001 at 01:57:32PM -0400, Rich Heiberger wrote:
> ## Thomas rightly points out that list() is not the best structure for
> ## homogeneous data.  My example was the simplest that generated the
> ## error of a matrix structure that doesn't work.  The application
> ## that this is simplified from needs lists because the data isn't

data aren't (?)

> ## homogeneous.  I am attempting to write a missing value class, where
> ## each item is a list.  In the simplest instance, if the datum is missing
> ## then the attribute contains the reason.
> ##
> ## Both of the 'bugs'/'unimplemented features'
> ##    3. a <- matrix(list(1,2,3,4,5,6), 2, 3)
> ##    6. bug in "[" for lists
> ## show up in this example.  Thus
>
> x <- list(1,2,3,4,5,6)
> dim(x) <- c(2,3)
> x[[2,3]] <- NA
> attr(x[[2,3]],"mv") <- "absent"
> x
> for(i in x) print(i)
> x[2,]
> x[1,1]
> x[2,3]

Yes, it appears to be either a bug or perhaps an unimplemented feature
(or possibly an unimplemented bug).

The problem with the S version of this is that the following doesn't
work the same as the above (I might argue that it should):

    > x<-matrix(1:6,nc=3)
    > x
         [,1] [,2] [,3]
    [1,]    1    3    5
    [2,]    2    4    6
    > x[[2,3]]<-NA
    > x
         [,1] [,2] [,3]
    [1,]    1    3    5
    [2,]    2    4   NA
    > attr(x[[2,3]],"mv") <- "absent"
    > for(i in x) print(i)
    [1] 1
    [1] 2
    [1] 3
    [1] 4
    [1] 5
    [1] NA

I am uncertain as to why you want to put dimensions on a list.  It
doesn't make it a matrix.  That said, we can duplicate the behaviour,
but I'm not sure that it is worth introducing the confusion that ought
to follow.
For my favorite silly example:

    > dim(mean)<-c(2,2)
    Warning messages:
      assigning "mean" masks an object of the same name on database 2
    > mean
         [,1] [,2]
    [1,] missing, 0
    [2,] 0 {, 5
    attr(, "names"):
    [1] "x"     "trim"  "na.rm" ""
    > is.matrix(mean)
    [1] T
    > mean(1:10)
    [1] 5.5
    >

Just because you can does not necessarily mean that you should.

>
> Rich

--
+---------------------------------------------------------------------------+
| Robert Gentleman                 phone : (617) 632-5250                   |
| Associate Professor              fax:   (617) 632-2444                    |
| Department of Biostatistics      office: M1B28                            |
| Harvard School of Public Health  email: rgentlem@jimmy.dfci.harvard.edu   |
+---------------------------------------------------------------------------+
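The difference between the atomic-matrix example and the dimensioned-list example in this exchange can be sketched as follows. This is an illustration only, assuming a recent R release; the 2001 versions of R and S-Plus under discussion may behave differently. The variable names `m`, `l`, `atomic_mv` and `list_mv` are made up for the sketch.

```r
# Atomic integer matrix: each cell is a bare value, so there is nowhere
# for a per-element attribute to be stored once the value is written back.
m <- matrix(1:6, ncol = 3)
m[[2, 3]] <- NA
attr(m[[2, 3]], "mv") <- "absent"   # attribute is dropped on assignment
atomic_mv <- attr(m[[2, 3]], "mv")

# Dimensioned list: each cell is a full object and can carry attributes.
l <- list(1, 2, 3, 4, 5, 6)
dim(l) <- c(2, 3)
l[[2, 3]] <- NA
attr(l[[2, 3]], "mv") <- "absent"
list_mv <- attr(l[[2, 3]], "mv")

print(atomic_mv)
print(list_mv)
```

The list version keeps the reason for missingness alongside the NA; the atomic version silently loses it.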
Robert's example

    x<-matrix(1:6,nc=3)
    x[[2,3]]<-NA
    attr(x[[2,3]],"mv") <- "absent"

loses the attribute in S-Plus as well as in R.  The only way I have
been able to put attributes on individual elements is to make a matrix
of lists.

Interestingly, this statement does print

    print(attr(x[[2,3]],"mv") <- "absent")

The value of the attribute doesn't vanish for the duration of the
current statement.  The value is gone if we follow this with

    attr(x[[2,3]],"mv")

Should there be a warning message when the assignment is attempted?

As to the fundamental question of putting dimensions on a list, that
is asking the question backwards from how I am looking at it.  I have
a two-dimensional array of data items for which one or more of the
values is unknown.  The matrix is the natural structure for this data.
How do I record the type of missingness for the missing datum?  I have
chosen to place a list in the cell of the matrix.  Another option is
to have a parallel matrix of missingness information, possibly
attached as an attribute to the original data.  Another option is a
sparse array of some form that is pointed to by access functions that
are sensitive to the NA in the original data matrix.  To me, the
recursive structure is the most natural way to represent the
information about missingness.

Rich
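For comparison, the "parallel matrix of missingness information" option mentioned above might look like the sketch below. It is not taken from the thread: the helper names `set_missing` and `missing_reason` are hypothetical, and the attribute name "mv" is reused from the earlier example only for continuity.

```r
# Ordinary numeric data matrix.
x <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, ncol = 3)

# Parallel character matrix of the same shape, stored as an attribute of x.
# NA_character_ means "no reason recorded".
attr(x, "mv") <- matrix(NA_character_, nrow = nrow(x), ncol = ncol(x))

# Hypothetical helper: mark cell (i, j) as missing and record why.
set_missing <- function(x, i, j, reason) {
  mv <- attr(x, "mv")
  mv[i, j] <- reason
  x[i, j] <- NA          # element assignment keeps the attributes of x
  attr(x, "mv") <- mv
  x
}

# Hypothetical accessor: the recorded reason for cell (i, j).
missing_reason <- function(x, i, j) attr(x, "mv")[i, j]

x <- set_missing(x, 2, 3, "absent")
print(is.na(x[2, 3]))
print(missing_reason(x, 2, 3))
```

One cost of this design is that subsetting with `[` drops the custom attribute, so a real class would need its own `[` method to keep the data and the reasons aligned. The recursive structure avoids that bookkeeping.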
[Rd] several bugs (PR#918) lists and matrices

Thank you Thomas for suggesting I review the proceedings paper on data
by Rob.  I liked that paper when I heard it and was happy to reread
it.  I agree with everything he said there.  After reading it I went
back to the Blue Book and have several comments.

Thomas said

    You can't put a list into a matrix. Matrices handle homogenous
    data; they are vectors with a dimension attribute.  Lists with an
    arbitrary dimension attribute are, as Rob pointed out, an
    unimplemented bug.

    However, rectangle things with arbitrary data in them do exist.
    They're called data frames.

Thomas's statement is a clear change in perspective from the original
Blue Book interpretation.  The Blue Book position, and my starting
position, is that "All objects are vectors."  Objects can be atomic or
recursive, and it is very clear in the text that subscripting is
applied to all vectors, no matter the complexity.  The description of
dim(x) on page 438 is very clear: x is any object.  The descriptions
of array() (page 382) and matrix() (page 504) are equally clear: "the
array class of objects are those that have an attribute dim, ...."

John's recent file
    http://www.omegahat.org/SLanguage/Aspects.html
repeats this generality with

    The original history of S (if you're interested see the note) led
    to a vector-oriented approach to all objects; that is, all objects
    were vectors (one-way arrays). Lists and other recursive objects
    were special only in that the elements of the vector were
    themselves objects.

It is less clear in the R help, but still indicated in the phrase
"Retrieve or set the dimension of an OBJECT." in the description of
dim().  The R help for array() and matrix() both say "data: a vector
giving data to fill the array" and the help for vector() clearly
permits vectors of mode list.  Nowhere do I see a claim that matrices
or arrays must contain only homogeneous atomic objects.

R currently (rw1021) believes a dimensioned list is a matrix (but not
a vector).

    > x <- list(1,2,3,4,5,6,7,8,9,10)
    > is.vector(x)
    [1] TRUE
    > dim(x) <- c(2,5)
    > x
         [,1]        [,2]        [,3]        [,4]        [,5]
    [1,] "Numeric,1" "Numeric,1" "Numeric,1" "Numeric,1" "Numeric,1"
    [2,] "Numeric,1" "Numeric,1" "Numeric,1" "Numeric,1" "Numeric,1"
    > is.matrix(x)
    [1] TRUE
    > is.vector(x)
    [1] FALSE

A data frame is essentially a list of arbitrarily structured column
vectors.  The recognition that Rob has in this paper is that the
individual columns can themselves contain arbitrarily structured data.
He is doing marvelous things with those data frame columns.
Everything he does will work on dimensioned lists and will avoid the
two serious difficulties with data frames that he notes.

a. The whole issue of character strings becoming factors and the
   non-intuitive I() and AsIs is a consequence of using data frames.
   This issue vanishes when dimensioned lists are used directly.

b. Parameterized data frames can be replaced by a new data structure
   in which a "parameter" attribute holds the relevant information.

My attempt to construct a "missing.value" class would work with a data
frame structure.  I don't think it is the best structure.

In summary, I am arguing for permitting
        a <- matrix(list(1,2,3,4,5,6), 2, 3)
along with the currently acceptable
        a <- list(1,2,3,4,5,6)
        dim(a) <- c(2, 3)
and for making the subscripting
        a[,1:2]
provide the correct answer
             [,1] [,2]
        [1,]    1    3
        [2,]    2    4
rather than the current incorrect answer
             [,1]   [,2]
        [1,] "NULL" "NULL"
        [2,] "NULL" "NULL"
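The behaviour argued for in the summary can be checked with a short script. This is a sketch under the assumption of a recent R release, where both construction routes and matrix-style subscripting of lists are expected to work; it says nothing about how rw1021 behaves. The name `sub` is chosen here for illustration.

```r
# Route 1: build the dimensioned list directly with matrix().
a <- matrix(list(1, 2, 3, 4, 5, 6), 2, 3)

# Route 2: build a plain list, then attach dimensions.
b <- list(1, 2, 3, 4, 5, 6)
dim(b) <- c(2, 3)

# Matrix-style subscripting should return the first two columns,
# still as a dimensioned list.
sub <- a[, 1:2]

print(dim(sub))
print(unlist(sub))
print(a[[2, 3]])
```

Flattening `sub` with unlist() gives the elements in column-major order, which corresponds to the 2 x 2 "correct answer" shown in the summary.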
On Mon, 30 Apr 2001, Rich Heiberger wrote:

> [Rd] several bugs (PR#918) lists and matrices
>
> Thank you Thomas for suggesting I review the proceedings paper on data
> by Rob.  I liked that paper when I heard it and was happy to reread
> it.  I agree with everything he said there.  After reading it I went
> back to the Blue Book and have several comments.

>
> In summary, I am arguing for permitting
>         a <- matrix(list(1,2,3,4,5,6), 2, 3)
> along with the currently acceptable
>         a <- list(1,2,3,4,5,6)
>         dim(a) <- c(2, 3)
> and for making the subscripting
>         a[,1:2]
> provide the correct answer
>              [,1] [,2]
>         [1,]    1    3
>         [2,]    2    4
> rather than the current incorrect answer
>              [,1]   [,2]
>         [1,] "NULL" "NULL"
>         [2,] "NULL" "NULL"

Yes, it probably should work.  I still prefer data frames, but I was
clearly wrong about dimensioned lists.

        -thomas