thr3ads.net - R devel - [Rd] head.matrix can return 1000s of columns -- limit to n or add new argument? [Nov 2019]

Gabriel Becker

2019-Nov-02 19:37 UTC

[Rd] head.matrix can return 1000s of columns -- limit to n or add new argument?

Thanks Martin and Peter,

I agree that we can be careful and narrow and still see a nice improvement
in behavior. While Herve's point is valid and I understand his frustration,
I think staying within the matrix vs  c(matrix, array) space is the right
scope for this work in terms of fiddling with inheritance.

As another point,  I don't know off the top of my head of any other classes
which we would expect to have a dimensions attribute other than arrays
(including the "non-array" 2d matrices) and data.frames, but I imagine
there are some out there.

Do we want the default head and tail methods to be dimension aware as well,
via something along the lines of what I had in my previous message, or do
we want to retain the old behavior for things that aren't data.frames or
matrix/arrays? If the dim attribute can always be assumed to mean the same
thing I feel like it would be nice to give the dimensionality awareness
(and idempotence) to anything with dimensions, but again I don't know much
about the other classes that have that attribute or how people want to use
them.

It would of course be written in a way that still worked identically to now
for any object that does not have a dimension attribute.
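
To make the question concrete, here is a minimal sketch of what a dimension-aware fallback might look like. The name head_dim and the convention that n[k] limits dimension k are assumptions made for illustration only, not the patch under discussion; negative n is not handled, and objects without a dim attribute are simply handed to the existing head().

```r
## Hypothetical helper, for illustration only (not part of base R or utils).
head_dim <- function(x, n = 6L) {
    d <- dim(x)
    if (is.null(d))                       # no dim attribute: defer to head()
        return(utils::head(x, n[1L]))
    ## pad n with NA so that unspecified dimensions are kept whole
    n <- c(n, rep(NA_integer_, length(d)))[seq_along(d)]
    idx <- lapply(seq_along(d), function(k)
        if (is.na(n[k])) seq_len(d[k]) else seq_len(min(n[k], d[k])))
    ## equivalent to x[idx[[1]], idx[[2]], ..., drop = FALSE]
    do.call(`[`, c(list(x), idx, list(drop = FALSE)))
}

m <- matrix(1:200, nrow = 10, ncol = 20)
dim(head_dim(m, c(3, 4)))   # limits rows and columns
dim(head_dim(m, 3))         # limits rows only
head_dim(letters, 3)        # plain vector: same as head(letters, 3)
```

Because every index is capped at the current extent, applying head_dim() twice with the same n gives the same result as applying it once, which is the idempotence mentioned above.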

Thoughts?

~G

On Fri, Nov 1, 2019 at 1:52 AM Martin Maechler <maechler at stat.math.ethz.ch> wrote:
> >>>>> peter dalgaard
> >>>>>     on Thu, 31 Oct 2019 23:04:29 +0100 writes:
>
>     > Hmm, the problem I see here is that these implied classes are all
> inherently one-off. We also have
>     >> inherits(matrix(1,1,1),"numeric")
>     > [1] FALSE
>     >> is.numeric(matrix(1,1,1))
>     > [1] TRUE
>     >> inherits(1L,"numeric")
>     > [1] FALSE
>     >> is.numeric(1L)
>     > [1] TRUE
>
>     > and if we start fixing one, we might need to fix all.
>
> I disagree about "fixing all" (see also my reply to Hervé), and
> the {"numeric","double","integer"} case is particularly messy,
> and I don't want to open that can now.
>
>     > For method dispatch, we do have inheritance, e.g.
>
>     >> foo.numeric <- function(x) x + 1
>     >> foo <- function(x) UseMethod("foo")
>     >> foo(1)
>     > [1] 2
>     >> foo(1L)
>     > [1] 2
>     >> foo(matrix(1,1,1))
>     > [,1]
>     > [1,]    2
>     >> foo.integer <- function(x) x + 2
>     >> foo(1)
>     > [1] 2
>     >> foo(1L)
>     > [1] 3
>     >> foo(matrix(1,1,1))
>     > [,1]
>     > [1,]    2
>     >> foo(matrix(1L,1,1))
>     > [,1]
>     > [1,]    3
>
>     > but these are not all automatic: "integer" implies "numeric", but
>     > "matrix" does not imply "numeric", much less "integer".
>
> well it should not imply in general:
> Contrary to Math,  we also have 'raw' or 'character' or 'logical' matrices.
>
>
>     > Also, we seem to have a rule that inherits(x, c)  iff  c %in% class(x),
>
> good point, and that's why my usage of  inherits(.,.) was not
> quite to the point.  [OTOH, it was to the point, as indeed from
>       the ?class / ?inherits docu, S3 method dispatch and inherits
>       must be consistent ]
>
>     > which would break -- unless we change class(x) to return the whole
>     > set of inherited classes, which I sense that we'd rather not do....
>
> and we have something like that already with  is(.)
>
> Thank you for these important points raised!
>
> Note again that both "matrix" and "array" are special [see ?class] as
> being of  __implicit class__  and I am considering that this
> implicit class behavior for these two should be slightly changed
> such that
>
>   foo <- function(x,...) UseMethod("foo")
>   foo.array <- function(x, ...)
>            sprintf("array of dim. %s", paste(dim(x), collapse = " x "))
>
> should work for all arrays and not be an exception for 2D arrays :
>
> > foo(array(pi, 1:3))
> [1] "array of dim. 1 x 2 x 3"
> > foo(array(pi, 1))
> [1] "array of dim. 1"
> > foo(array(pi, 2:7))
> [1] "array of dim. 2 x 3 x 4 x 5 x 6 x 7"
> > foo(array(pi, 1:2))
> Error in UseMethod("foo") :
>   no applicable method for 'foo' applied to an object of class
> "c('matrix', 'double', 'numeric')"
> >
>
> And indeed I think you are right on spot and this would mean
> that indeed the implicit class
> "matrix" should rather become c("matrix", "array").
>
> BTW: The 'Details' section of   ?class   nicely defines things,
>      notably the __implicit class__ situation
>      (but I think should be improved)  :
>
>      {numbering the paragraphs for reference}
>
> > Details:
> >
> > 1.   Here, we describe the so called 'S3' classes (and methods). For
> >      'S4' classes (and methods), see 'Formal classes' below.
> >
> > 2.   Many R objects have a class attribute, a character vector giving
> >      the names of the classes from which the object _inherits_.
> >      (Functions oldClass and oldClass<- get and set the attribute,
> >      which can also be done directly.)
> >
> > 3.   If the object does not have a class attribute, it has an implicit
> >      class, notably '"matrix"', '"array"', '"function"' or '"numeric"'
> >      or the result of 'typeof(x)' (which is similar to 'mode(x)'), but
> >      for type '"language"' and mode '"call"', where the following
> >      extra classes exist for the corresponding function calls: if,
> >      while, for, =, <-, (, {, call.
>
> So, I think clearly  { for S3, not S4 ! }
>
>   "class attribute" :=  attr(x, "class")
>
>   "implicit class" := the class(x) of R objects that do *not*
>                       have a class attribute
>
>
> > 4.   Note that NULL objects cannot have attributes (hence not
> >      classes) and attempting to assign a class is an error.
>
> the above has one small flaw : "(hence not classes)" is not correct.
> Of course   class(NULL) is "NULL" by par. 3's  typeof(x) "rule".
>
> > 5a.  When a generic function 'fun' is applied to an object with class
> >      attribute 'c("first", "second")', the system searches for a
> >      function called 'fun.first' and, if it finds it, applies it to the
> >      object.  If no such function is found, a function called
> >      'fun.second' is tried.  If no class name produces a suitable
> >      function, the function 'fun.default' is used (if it exists).
> > 5b.  If there is no class attribute, the implicit class is tried, then the
> >      default method.
>
> > 6.   The function 'class' prints the vector of names of classes an
> >      object inherits from.  Correspondingly, class<- sets the classes
> >      an object inherits from.  Assigning NULL removes the class
> >      attribute.
>
> ["of course", the word  "prints" above should be replaced by "returns" ! ]
>
> > 7.   'unclass' returns (a copy of) its argument with its class
> >      attribute removed.  (It is not allowed for objects which cannot be
> >      copied, namely environments and external pointers.)
>
> > 8.   'inherits' indicates whether its first argument inherits from any
> >      of the classes specified in the 'what' argument.  If which is
> >      TRUE then an integer vector of the same length as 'what' is
> >      returned.  Each element indicates the position in the 'class(x)'
> >      matched by the element of 'what'; zero indicates no match. If
> >      which is FALSE then TRUE is returned by inherits if any of
> >      the names in 'what' match with any class.
>
> {I had forgotten that the 2nd argument of inherits, 'what', can
>  be a vector and about the 'which' argument}
>
>
>     >> On 30 Oct 2019, at 12:29 , Martin Maechler <maechler at stat.math.ethz.ch> wrote:
>     >>
>     >> Note however the following  historical quirk :
>     >>
>     >>> sapply(setNames(,1:5), function(K) inherits(array(pi, dim=1:K), "array"))
>     >> 1     2     3     4     5
>     >> TRUE FALSE  TRUE  TRUE  TRUE
>     >>
>     >> (Is this something we should consider changing for R 4.0.0 -- to
>     >> have it TRUE also for 2d-arrays aka matrix objects ??)
>
>     > --
>     > Peter Dalgaard, Professor,
>     > Center for Statistics, Copenhagen Business School
>     > Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>     > Phone: (+45)38153501
>     > Office: A 4.23
>     > Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
>

Gabriel Becker

2019-Nov-02 19:40 UTC

[Rd] head.matrix can return 1000s of columns -- limit to n or add new argument?

As I hit send I realized I did know of one, which is table objects. So
while we're discussing it we can talk about both generally and specifically
what head.table and tail.table should do. Looks like tail.table is already
special-cased to hit the matrix method if it is 2d, so the natural
extension of that would be hitting tail.array for any 2+d table, I think.
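
As a quick check that table objects do carry a dim attribute while keeping their own class, consider a three-way table built from the mtcars data set (chosen here only for illustration):

```r
## 3-way contingency table from a built-in data set
tab <- with(mtcars, table(cyl, gear, am))
class(tab)      # "table", an explicit class attribute
dim(tab)        # one extent per classifying factor
is.array(tab)   # TRUE, because the object has a dim attribute
```

So a table with three or more classifying factors looks like a higher-dimensional array to anything that only inspects dim().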

~G


Martin Maechler

2019-Nov-10 08:36 UTC

[Rd] class(<matrix>) |--> c("matrix", "arrary") [was "head.matrix ..."]

>>>>> Gabriel Becker 
>>>>>     on Sat, 2 Nov 2019 12:37:08 -0700 writes:
    > I agree that we can be careful and narrow and still see a
    > nice improvement in behavior. While Herve's point is valid
    > and I understand his frustration, I think staying within
    > the matrix vs c(matrix, array) space is the right scope
    > for this work in terms of fiddling with inheritance.

 [.................]

> > Also, we seem to have a rule that inherits(x, c)  iff  c %in% class(x),
>
> good point, and that's why my usage of  inherits(.,.) was not
> quite to the point.  [OTOH, it was to the point, as indeed from
>       the ?class / ?inherits docu, S3 method dispatch and inherits
>       must be consistent ]
>
> > which would break -- unless we change class(x) to return the whole
> > set of inherited classes, which I sense that we'd rather not do....

  [................]

> Note again that both "matrix" and "array" are special [see ?class] as
> being of  __implicit class__  and I am considering that this
> implicit class behavior for these two should be slightly
> changed ....
>
> And indeed I think you are right on spot and this would mean
> that indeed the implicit class
> "matrix" should rather become c("matrix", "array").
I've made up my mind (and not been contradicted by my fellow R
corers) to try to go there for  R 4.0.0   next April.

I've found the few places in base R that needed a change (to
pass 'make check-all' in the R sources) and found that indeed an
overzealous check in 'Matrix' also needed a change (a place
where the checking code assumes  class(<matrix>) |--> "matrix" ).

There are certainly many more packages (code and checks) that
need adaptation .. i.e., should be changed rather *before* the
above change is activated in R-devel (and then will affect all CRAN
and Bioconductor checks.)

To this end, I've published an  'R Blog' yesterday,

   bit.ly/R_blog_class_think_2x

which translates to

   developer.r-project.org/Blog/public/2019/11/09/when-you-think-class.-think-again/index.html

notably mentioning why using  class(x) == "...."  (or '!=')  or
switch(class(.) ...)  is quite unsafe and hence bad and you
should very often not replace  class(x)  by  class(x)[1]  but
really use the "only truly correct" ;-)

     inherits(x,  "...")
or
     is(x,  "....")   # if you're advanced/brave enough (:-) to
                      # use formal classes (S4)
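
A small sketch of why the comparison is fragile, assuming an R version in which class(<matrix>) has length two as proposed above. The variable names are illustrative only.

```r
m <- matrix(1:4, 2, 2)

## One logical per element of class(m); once a matrix has two implicit
## classes this is no longer a single TRUE/FALSE suitable for if().
fragile <- class(m) == "matrix"
length(fragile) == length(class(m))

## Always a single logical, whatever the length of class(m):
robust <- inherits(m, "matrix")
robust
```

The same reasoning applies to switch(class(x), ...), which expects a single string.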
     
Martin Maechler
ETH Zurich and R Core Team

Bryan Hanson

2019-Nov-10 14:17 UTC

[Rd] class(<matrix>) |--> c("matrix", "arrary") [was "head.matrix ..."]

> On Nov 10, 2019, at 3:36 AM, Martin Maechler <maechler at stat.math.ethz.ch> wrote:
> 
> To this end, I've published an  'R Blog' yesterday,
> 
>   bit.ly/R_blog_class_think_2x
> 
> which translates to
> 
>   developer.r-project.org/Blog/public/2019/11/09/when-you-think-class.-think-again/index.html
> 
> notably mentioning why using  class(x) == "...."  (or '!=')  or
> switch(class(.) ...)  is quite unsafe and hence bad and you
> should very often not replace  class(x)  by  class(x)[1]  but
> really use the "only truly correct" ;-)
> 
>     inherits(x,  "...")
> or
>     is(x,  "....")   # if you're advanced/brave enough (:-) to
>                      # use formal classes (S4)

Thanks for the helpful blog post Martin. Is the following

  "test_class"  %in% class(some_object)

which I think in your symbols would be

  "..." %in% class(x)

safe as far as you see it? By safe, I mean equivalent to your suggestion of
inherits(x, "...") .

Thanks, Bryan
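
For what it is worth, a quick check suggests the two forms agree for S3-style objects but can differ for formal (S4) classes, where inherits() follows the S4 inheritance that class(x) alone does not report. The class names below are made up for the example.

```r
library(methods)

## S3: both forms look at the same character vector
x <- structure(list(), class = c("child", "parent"))
"parent" %in% class(x)
inherits(x, "parent")

## S4: class(d) reports only "Derived"; inherits() also sees "Base"
setClass("Base", representation(v = "numeric"))
setClass("Derived", contains = "Base")
d <- new("Derived", v = 1)
"Base" %in% class(d)
inherits(d, "Base")
```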

Hadley Wickham

2019-Nov-14 13:47 UTC

[Rd] class(<matrix>) |--> c("matrix", "arrary") [was "head.matrix ..."]

On Sun, Nov 10, 2019 at 2:37 AM Martin Maechler <maechler at stat.math.ethz.ch> wrote:
>
> > And indeed I think you are right on spot and this would mean
> > that indeed the implicit class
> > "matrix" should rather become c("matrix", "array").
>
> I've made up my mind (and not been contradicted by my fellow R
> corers) to try go there for  R 4.0.0   next April.
I can't seem to find the previous thread, so would you mind being a
bit more explicit here? Do you mean adding "array" to the implicit
class? Or adding it to the explicit class? Or adding it to inherits?
i.e. which of the following results are you proposing to change?

is_array <- function(x) UseMethod("is_array")
is_array.array <- function(x) TRUE
is_array.default <- function(x) FALSE

x <- matrix()
is_array(x)
#> [1] FALSE
x <- matrix()
inherits(x, "array")
#> [1] FALSE
class(x)
#> [1] "matrix"

It would be nice to make sure this is consistent with the behaviour of
integers, which have an implicit parent class of numeric:

is_numeric <- function(x) UseMethod("is_numeric")
is_numeric.numeric <- function(x) TRUE
is_numeric.default <- function(x) FALSE

x <- 1L
is_numeric(x)
#> [1] TRUE
inherits(x, "numeric")
#> [1] FALSE
class(x)
#> [1] "integer"

Hadley

-- 
hadley.nz

Abby Spurdle

2019-Nov-15 21:19 UTC

[Rd] class(<matrix>) |--> c("matrix", "arrary") [was "head.matrix ..."]

> > And indeed I think you are right on spot and this would mean
> > that indeed the implicit class
> > "matrix" should rather become c("matrix", "array").
>
> I've made up my mind (and not been contradicted by my fellow R
> corers) to try go there for  R 4.0.0   next April.
I'm not enthusiastic about matrices extending arrays.
If a matrix is an array, then shouldn't all vectors in R be arrays too?
> #mockup
> class (1)
[1] "numeric" "array"

Which is a bad idea.
It contradicts the central principle that R uses "Vectors" rather than
"Arrays".
And I feel that matrices are and should be, a special case of vectors.
(With their inheritance from vectors taking precedence over anything else).

If the motivation is to solve the problem of 2D arrays, automatically
being mapped to matrices:
> class (array (1, c (2, 2) ) )
[1] "matrix"

Then wouldn't it be better, to treat 2D arrays, as a special case, and
leave matrices as they are?
> #mockup
> class (array (1, c (2, 2) ) )
[1] "array2d" "matrix" "array"

Then 2D arrays would have access to both matrix and array methods...

Note, I don't want to enter into (another) discussion on the
differences between implicit class and classes defined via a class
attribute.
That's another discussion, which has little to do with my points above.

Martin Maechler

2019-Nov-28 14:30 UTC

[Rd] head.matrix can return 1000s of columns ..

>>>>> Gabriel Becker 
>>>>>     on Sat, 2 Nov 2019 12:40:16 -0700 writes:
    [....................]

In the mean time,  Gabe had worked quite a bit and provided a
patch proposal  at R's bugzilla,  PR#17652 ,
i.e., here
      bugs.r-project.org/bugzilla/show_bug.cgi?id=17652

A few days ago, I had committed a (slightly simplified) version
of that to R-devel (svn rev 77462 )
with NEWS entry

    * head(x, n) and tail() default and other S3 methods notably for
      _vector_ n, e.g. to get a "corner" of a matrix, also extended for
      array's of higher dimension, thanks to the patch proposal by Gabe
      Becker in PR#16764.

 (which contains a *wrong* PR number that I've corrected in the
  mean time)
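
As an illustration of the NEWS entry, and assuming an R version that includes this change (R-devel at the time, released as R 4.0.0), a vector n selects a "corner":

```r
m <- matrix(1:20, nrow = 4, ncol = 5)
dim(head(m, c(2, 3)))   # first 2 rows, first 3 columns
dim(tail(m, c(2, 3)))   # last 2 rows, last 3 columns

a <- array(1:24, dim = 2:4)
dim(head(a, c(1, 2, 2)))  # one limit per dimension of the array
```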

A day or so later, CRAN alerted me to the fact that this
change breaks the checks of some CRAN packages, about 30 by now
it seems.

There were at least two principal reasons, one of which was the
fact that data frame subsetting has been somewhat surprising in R,
without being documented so, *and* some packages have
inadvertently made use of this peculiarity -- which was
inadvertently changed by r77462.

In short,   head(<data frame>)  kept extraneous attributes
because indeed
                d[i, ]
keeps those attributes ... for data frames.

I will amend the  head() and tail() methods to remain back
compatible (as much as sensible) for now,  but here's what I've
found about subsetting, i.e., behavior of the (partly C code
internal)  `[`  methods in R :

1)  For a data frame d,  d[i, ]  differs  from  d[i,j],
    as the former keeps (extra) attributes,
2)  For a matrix both forms of indexing do not keep (extra) attributes.

Here's some simple reproducible R code exhibiting the claim:

##==== Data frame subsetting (vs. matrix, array)  "with extra attributes": ====
## data frame w/ a (non-standard) attribute:
str(treeS <- structure(trees, foo = "bar"))

chkMat <- function(M) {
    stopifnot(nzchar(Mfoo <- attr(M, "foo")),
              length(d <- dim(M)) == 2,
              (n <- d[1]) >= 6, d[2] >= 3)
    ## n = nrow(M)
    stopifnot(exprs = { # attribute is kept
        if(inherits(M, "data.frame")) {
            identical(  attr(M[    1:3 , ] , "foo") , "bar") &&
            identical(  attr(M[(n-2):n , ] , "foo") , "bar")
        } else { ## matrix
            is.null  (  attr(M[    1:3 , ] , "foo")) &&
            is.null  (  attr(M[(n-2):n , ] , "foo"))
        }
        ## OTOH,  [i,j]-indexing of data frames *does* drop "other" attributes:
        inherits(print(t.ij <- M[(n-2):n, 2:3] ), class(M))
        ## now, the "foo" attribute of  M[i,j] is gone!
        is.null(attr(t.ij, "foo"))
    })
}

chkMat(treeS)
chkMat(as.matrix(treeS))

-------

And (to repeat), currently  head(d, n)  is the same as   d[1:n , ]
when n >= 1,  length(n) == 1  and this equality is relied upon
by CRAN package code out there .. and hence I'll keep it with
the "generalized" head() & tail() in R-devel.
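
The equivalence relied upon can be spot-checked directly; this is only a sketch using the built-in trees data, as in the code above:

```r
## head() of a data frame with scalar n versus plain row indexing
identical(head(trees, 3), trees[1:3, ])

## row-only indexing keeps a custom attribute, [i, j] indexing drops it
treeS <- structure(trees, foo = "bar")
attr(treeS[1:3, ], "foo")
attr(treeS[1:3, 2:3], "foo")
```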

Martin
