peter dalgaard
2019-Oct-31 22:04 UTC
[Rd] head.matrix can return 1000s of columns -- limit to n or add new argument?
Hmm, the problem I see here is that these implied classes are all inherently one-off. We also have

> inherits(matrix(1,1,1),"numeric")
[1] FALSE
> is.numeric(matrix(1,1,1))
[1] TRUE
> inherits(1L,"numeric")
[1] FALSE
> is.numeric(1L)
[1] TRUE

and if we start fixing one, we might need to fix all.

For method dispatch, we do have inheritance, e.g.

> foo.numeric <- function(x) x + 1
> foo <- function(x) UseMethod("foo")
> foo(1)
[1] 2
> foo(1L)
[1] 2
> foo(matrix(1,1,1))
     [,1]
[1,]    2
> foo.integer <- function(x) x + 2
> foo(1)
[1] 2
> foo(1L)
[1] 3
> foo(matrix(1,1,1))
     [,1]
[1,]    2
> foo(matrix(1L,1,1))
     [,1]
[1,]    3

but these are not all automatic: "integer" implies "numeric", but "matrix" does not imply "numeric", much less "integer".

Also, we seem to have a rule that inherits(x, c) iff c %in% class(x), which would break -- unless we change class(x) to return the whole set of inherited classes, which I sense that we'd rather not do....

-pd

> On 30 Oct 2019, at 12:29 , Martin Maechler <maechler at stat.math.ethz.ch> wrote:
>
> Note however the following historical quirk :
>
>> sapply(setNames(,1:5), function(K) inherits(array(pi, dim=1:K), "array"))
>     1     2     3     4     5
>  TRUE FALSE  TRUE  TRUE  TRUE
>
> (Is this something we should consider changing for R 4.0.0 -- to
> have it TRUE also for 2d-arrays aka matrix objects ??)

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
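The rule mentioned above, inherits(x, c) iff c %in% class(x), can be checked directly for a few objects that carry no class attribute. This is only a minimal sketch: the object list and the candidate class names are arbitrary choices for illustration, and the second part simply restates the is.numeric() versus inherits() mismatch from the transcript.

```r
# Objects without a class attribute, so class(x) is their implicit class.
objs <- list(dbl = 1, int = 1L, mat = matrix(1, 1, 1), chr = "a")
candidates <- c("numeric", "integer", "matrix", "array", "character")

# Rule from the message: inherits(x, c) iff c %in% class(x).
rule_holds <- vapply(objs, function(x) {
  all(vapply(candidates,
             function(k) identical(inherits(x, k), k %in% class(x)),
             logical(1)))
}, logical(1))
print(rule_holds)

# is.numeric() looks at the underlying type rather than at class(x),
# so it can disagree with inherits(x, "numeric").
disagree <- vapply(objs,
                   function(x) is.numeric(x) != inherits(x, "numeric"),
                   logical(1))
print(disagree)
```

The check only covers plain S3 objects; nothing here says how S4 objects or is(.) behave.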
Martin Maechler
2019-Nov-01 08:52 UTC
[Rd] head.matrix can return 1000s of columns -- limit to n or add new argument?
>>>>> peter dalgaard
>>>>>     on Thu, 31 Oct 2019 23:04:29 +0100 writes:

> Hmm, the problem I see here is that these implied classes are all inherently one-off. We also have
>> inherits(matrix(1,1,1),"numeric")
> [1] FALSE
>> is.numeric(matrix(1,1,1))
> [1] TRUE
>> inherits(1L,"numeric")
> [1] FALSE
>> is.numeric(1L)
> [1] TRUE

> and if we start fixing one, we might need to fix all.

I disagree about "fixing all" (see also my reply to Hervé), and the {"numeric","double","integer"} case is particularly messy, and I don't want to open that can now.

> For method dispatch, we do have inheritance, e.g.

>> foo.numeric <- function(x) x + 1
>> foo <- function(x) UseMethod("foo")
>> foo(1)
> [1] 2
>> foo(1L)
> [1] 2
>> foo(matrix(1,1,1))
>      [,1]
> [1,]    2
>> foo.integer <- function(x) x + 2
>> foo(1)
> [1] 2
>> foo(1L)
> [1] 3
>> foo(matrix(1,1,1))
>      [,1]
> [1,]    2
>> foo(matrix(1L,1,1))
>      [,1]
> [1,]    3

> but these are not all automatic: "integer" implies "numeric", but "matrix" does not imply "numeric", much less "integer".

well it should not imply in general: Contrary to Math, we also have 'raw' or 'character' or 'logical' matrices.

> Also, we seem to have a rule that inherits(x, c) iff c %in% class(x),

good point, and that's why my usage of inherits(.,.) was not quite to the point. [OTOH, it was to the point, as indeed from the ?class / ?inherits docu, S3 method dispatch and inherits must be consistent ]

> which would break -- unless we change class(x) to return the whole set of inherited classes, which I sense that we'd rather not do....

and we have something like that already with is(.)

Thank you for these important points raised!

Note again that both "matrix" and "array" are special [see ?class] as being of __implicit class__ and I am considering that this implicit class behavior for these two should be slightly changed such that

  foo <- function(x,...) UseMethod("foo")
  foo.array <- function(x, ...)
     sprintf("array of dim. %s", paste(dim(x), collapse = " x "))

should work for all arrays and not be an exception for 2D arrays :

> foo(array(pi, 1:3))
[1] "array of dim. 1 x 2 x 3"
> foo(array(pi, 1))
[1] "array of dim. 1"
> foo(array(pi, 2:7))
[1] "array of dim. 2 x 3 x 4 x 5 x 6 x 7"
> foo(array(pi, 1:2))
Error in UseMethod("foo") :
  no applicable method for 'foo' applied to an object of class "c('matrix', 'double', 'numeric')"
>

And indeed I think you are right on spot and this would mean that indeed the implicit class "matrix" should rather become c("matrix", "array").

BTW: The 'Details' section of ?class nicely defines things, notably the __implicit class__ situation (but I think should be improved) :

{numbering the paragraphs for reference}

> Details:
>
> 1. Here, we describe the so called 'S3' classes (and methods). For
>    'S4' classes (and methods), see 'Formal classes' below.
>
> 2. Many R objects have a class attribute, a character vector giving
>    the names of the classes from which the object _inherits_.
>    (Functions oldClass and oldClass<- get and set the attribute,
>    which can also be done directly.)
>
> 3. If the object does not have a class attribute, it has an implicit
>    class, notably '"matrix"', '"array"', '"function"' or '"numeric"'
>    or the result of 'typeof(x)' (which is similar to 'mode(x)'), but
>    for type '"language"' and mode '"call"', where the following
>    extra classes exist for the corresponding function calls: if,
>    while, for, =, <-, (, {, call.

So, I think clearly { for S3, not S4 ! }

  "class attribute" := attr(x, "class")

  "implicit class" := the class(x) of R objects that do *not* have a class attribute

> 4. Note that NULL objects cannot have attributes (hence not
>    classes) and attempting to assign a class is an error.

the above has one small flaw : "(hence not classes)" is not correct. Of course class(NULL) is "NULL" by par. 3's typeof(x) "rule".

> 5a. When a generic function 'fun' is applied to an object with class
>    attribute 'c("first", "second")', the system searches for a
>    function called 'fun.first' and, if it finds it, applies it to the
>    object. If no such function is found, a function called
>    'fun.second' is tried. If no class name produces a suitable
>    function, the function 'fun.default' is used (if it exists).
> 5b. If there is no class attribute, the implicit class is tried, then the
>    default method.

> 6. The function 'class' prints the vector of names of classes an
>    object inherits from. Correspondingly, class<- sets the classes
>    an object inherits from. Assigning NULL removes the class
>    attribute.

["of course", the word "prints" above should be replaced by "returns" ! ]

> 7. 'unclass' returns (a copy of) its argument with its class
>    attribute removed. (It is not allowed for objects which cannot be
>    copied, namely environments and external pointers.)

> 8. 'inherits' indicates whether its first argument inherits from any
>    of the classes specified in the 'what' argument. If which is
>    TRUE then an integer vector of the same length as 'what' is
>    returned. Each element indicates the position in the 'class(x)'
>    matched by the element of 'what'; zero indicates no match. If
>    which is FALSE then TRUE is returned by inherits if any of
>    the names in 'what' match with any class.

{I had forgotten that the 2nd argument of inherits, 'what', can be a vector and about the 'which' argument}

>> On 30 Oct 2019, at 12:29 , Martin Maechler <maechler at stat.math.ethz.ch> wrote:
>>
>> Note however the following historical quirk :
>>
>>> sapply(setNames(,1:5), function(K) inherits(array(pi, dim=1:K), "array"))
>>     1     2     3     4     5
>>  TRUE FALSE  TRUE  TRUE  TRUE
>>
>> (Is this something we should consider changing for R 4.0.0 -- to
>> have it TRUE also for 2d-arrays aka matrix objects ??)
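The dispatch behaviour under discussion can be probed in a version-tolerant way. The sketch below reuses the foo / foo.array definitions from the message and wraps each call in tryCatch(), so an R version in which 2-d arrays do not reach foo.array records the failure rather than stopping; as far as I know, R 4.0.0 and later give matrices the implicit class c("matrix", "array"), but the code does not rely on that. The last lines illustrate paragraph 8 of ?class with a made-up object of class c("first", "second").

```r
foo <- function(x, ...) UseMethod("foo")
foo.array <- function(x, ...)
  sprintf("array of dim. %s", paste(dim(x), collapse = " x "))

# Dispatch on arrays with 1 to 4 dimensions; the 2-d result depends on
# the R version in use, so errors are caught and reported as text.
res <- vapply(1:4, function(K)
  tryCatch(foo(array(pi, dim = seq_len(K))),
           error = function(e) "no applicable method"),
  character(1))
print(res)

# The implicit class that the running R version reports for a 2-d array.
print(class(array(pi, 1:2)))

# 'what' may be a vector; which = TRUE returns positions in class(x),
# with 0 for names that do not match.
obj <- structure(list(), class = c("first", "second"))
pos <- inherits(obj, c("second", "other", "first"), which = TRUE)
print(pos)
print(inherits(obj, c("other", "second")))
```

Comparing res[2] with the other entries shows at a glance whether the running R version still treats 2-d arrays as the exception described above.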
Gabriel Becker
2019-Nov-02 19:37 UTC
[Rd] head.matrix can return 1000s of columns -- limit to n or add new argument?
Thanks Martin and Peter,

I agree that we can be careful and narrow and still see a nice improvement in behavior. While Hervé's point is valid and I understand his frustration, I think staying within the matrix vs c(matrix, array) space is the right scope for this work in terms of fiddling with inheritance.

As another point, I don't know off the top of my head of any other classes which we would expect to have a dimensions attribute other than arrays (including the "non-array" 2d matrices) and data.frames, but I imagine there are some out there.

Do we want the default head and tail methods to be dimension aware as well, via something along the lines of what I had in my previous message, or do we want to retain the old behavior for things that aren't data.frames or matrix/arrays? If the dim attribute can always be assumed to mean the same thing I feel like it would be nice to give the dimensionality awareness (and idempotence) to anything with dimensions, but again I don't know much about the other classes that have that attribute or how people want to use them.

It would of course be written in a way that still worked identically to now for any object that does not have a dimension attribute.

Thoughts?

~G
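To make the question concrete, here is a minimal sketch of what a dimension-aware head could look like. The function name head_dims and its recycling of n over all dimensions are assumptions made for illustration only; this is not the code from the earlier message and not anything in utils. It handles non-negative n only and falls back to utils::head() for objects without a dim attribute.

```r
# Hypothetical helper, not part of base R or utils.
head_dims <- function(x, n = 6L) {
  d <- dim(x)
  # No dim attribute: behave exactly like the existing head().
  if (is.null(d)) return(utils::head(x, n))
  # Recycle n so there is one limit per dimension (an assumed convention).
  n <- rep_len(as.integer(n), length(d))
  # Keep at most n[i] leading indices along dimension i.
  idx <- lapply(seq_along(d), function(i) seq_len(min(n[i], d[i])))
  do.call(`[`, c(list(x), idx, list(drop = FALSE)))
}

m <- matrix(1:10000, nrow = 10, ncol = 1000)
print(dim(head_dims(m)))                      # limited in both dimensions

a <- array(1:24, dim = 2:4)
print(dim(head_dims(a, c(1, 2, 3))))

# Applying it twice changes nothing further (the idempotence mentioned above).
print(identical(head_dims(head_dims(m)), head_dims(m)))
```

Whether recycling a scalar n across every dimension is the right default is exactly the kind of design choice the question above leaves open.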