thr3ads.net - R devel - [Rd] R (development) changes in arith, logic, relop with (0-extent) arrays [Sep 2016]

Gabriel Becker

2016-Sep-08 15:43 UTC

[Rd] R (development) changes in arith, logic, relop with (0-extent) arrays

Martin,

Like Robin and Oliver I think this type of edge-case consistency is
important and that it's fantastic that R-core - and you personally - are
willing to tackle some of these "gotcha" behaviors. "Little" stuff like
this really does combine to go a long way to making R better and better.

I do wonder a bit about the

x = 1:2

y = NULL

x < y

case.

Returning a logical of length 0 is more backwards compatible, but is it
ever what the author actually intended? I have trouble thinking of a case
where that less-than didn't carry an implicit assumption that y was
non-NULL.  I can say that in my own code, I've never hit that behavior in a
case that wasn't an error.

My vote (unless someone else points out a compelling use for the behavior)
is for this to throw an error. As a developer, I'd rather things like this
break so the bug in my logic is visible, rather than propagating as the
0-length logical is &'ed or |'ed with other logical vectors, or used to
subset, or (in the case it should be length 1) passed to if() (if() throws
an error now, but the rest would silently "work").
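
To make the propagation concern concrete, here is a minimal sketch of how the
empty result of x < y can travel through later code. The variable names cmp
and res are only illustrative and are not from the thread; the behavior shown
is for a plain vector compared with NULL.

```r
x <- 1:2
y <- NULL

cmp <- x < y          # logical(0): no warning, no error
length(cmp)           # 0

# The empty result is absorbed silently by later operations:
cmp & c(TRUE, FALSE)  # logical(0)
x[cmp]                # integer(0)
any(cmp)              # FALSE
all(cmp)              # TRUE

# Only if() objects, because its condition must have length one:
res <- tryCatch(if (cmp) "yes" else "no", error = function(e) "error")
res                   # "error"
```

Everything except the if() call completes without any signal that y was
never set.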

Best,
~G

On Thu, Sep 8, 2016 at 3:49 AM, Martin Maechler <maechler at stat.math.ethz.ch>
wrote:
> >>>>> robin hankin <hankin.robin at gmail.com>
> >>>>>     on Thu, 8 Sep 2016 10:05:21 +1200 writes:
>
>     > Martin I'd like to make a comment; I think that R's
>     > behaviour on 'edge' cases like this is an important thing
>     > and it's great that you are working on it.
>
>     > I make heavy use of zero-extent arrays, chiefly because
>     > the dimnames are an efficient and logical way to keep
>     > track of certain types of information.
>
>     > If I have, for example,
>
>     > a <- array(0,c(2,0,2))
>     > dimnames(a) <- list(name=c('Mike','Kevin'), NULL, item=c("hat","scarf"))
>
>
>     > Then in R-3.3.1, 70800 I get
>
>     > > a > 0
>     > logical(0)
>     > >
>
>     > But in 71219 I get
>
>     > > a > 0
>     > , , item = hat
>
>
>     > name
>     > Mike
>     > Kevin
>
>     > , , item = scarf
>
>
>     > name
>     > Mike
>     > Kevin
>
>     > (which is an empty logical array that holds the names of the people
>     > and their clothes). I find the behaviour of 71219 very much
>     > preferable because there is no reason to discard the information in
>     > the dimnames.
>
> Thanks a lot, Robin, (and Oliver) !
>
> Yes, the above is such a case where the new behavior makes much sense.
> And this behavior remains identical after the 71222 amendment.
>
> Martin
>
>     > Best wishes
>     > Robin
>
>
>
>
>     > On Wed, Sep 7, 2016 at 9:49 PM, Martin Maechler <maechler at stat.math.ethz.ch>
>     > wrote:
>
>     >> >>>>> Martin Maechler <maechler at stat.math.ethz.ch>
>     >> >>>>>     on Tue, 6 Sep 2016 22:26:31 +0200 writes:
>     >>
>     >> > Yesterday, changes to R's development version were committed,
>     >> > relating to arithmetic, logic ('&' and '|') and
>     >> > comparison/relational ('<', '==') binary operators
>     >> > which in NEWS are described as
>     >>
>     >> > SIGNIFICANT USER-VISIBLE CHANGES:
>     >>
>     >> > [.............]
>     >>
>     >> > • Arithmetic, logic (‘&’, ‘|’) and comparison (aka
>     >> > ‘relational’, e.g., ‘<’, ‘==’) operations with arrays now
>     >> > behave consistently, notably for arrays of length zero.
>     >>
>     >> > Arithmetic between length-1 arrays and longer non-arrays had
>     >> > silently dropped the array attributes and recycled.  This
>     >> > now gives a warning and will signal an error in the future,
>     >> > as it has always for logic and comparison operations in
>     >> > these cases (e.g., compare ‘matrix(1,1) + 2:3’ and
>     >> > ‘matrix(1,1) < 2:3’).
>     >>
>     >> > As the above "visually suggests" one could think of the changes
>     >> > falling mainly into two groups,
>     >> > 1) <0-extent array>  (op)     <non-array>
>     >> > 2) <1-extent array>  (arith)  <non-array of length != 1>
>     >>
>     >> > These changes are partly non-back compatible and may break
>     >> > existing code.  We believe that the internal consistency gained
>     >> > from the changes is worth the few places with problems.
>     >>
>     >> > We expect some package maintainers (10-20, or even more?) need
>     >> > to adapt their code.
>     >>
>     >> > Case '2)' above mainly results in a new warning, e.g.,
>     >>
>     >> > > matrix(1,1) + 1:2
>     >> > [1] 2 3
>     >> > Warning message:
>     >> > In matrix(1, 1) + 1:2 :
>     >> > dropping dim() of array of length one.  Will become ERROR
>     >> > >
>     >>
>     >> > whereas '1)' gives errors in cases the result silently was a
>     >> > vector of length zero, or also keeps array (dim & dimnames) in
>     >> > cases these were silently dropped.
>     >>
>     >> > The following is a "heavily" commented  R script showing (all ?)
>     >> > the important cases with changes :
>     >>
>     >> > ----------------------------------------------------------------------------
>     >>
>     >> > (m <- cbind(a=1[0], b=2[0]))
>     >> > Lm <- m; storage.mode(Lm) <- "logical"
>     >> > Im <- m; storage.mode(Im) <- "integer"
>     >>
>     >> > ## 1. -------------------------
>     >> > try( m & NULL ) # in R <= 3.3.x :
>     >> > ## Error in m & NULL :
>     >> > ##  operations are possible only for numeric, logical or complex types
>     >> > ##
>     >> > ## gives 'Lm' in R >= 3.4.0
>     >>
>     >> > ## 2. -------------------------
>     >> > m + 2:3 ## gave numeric(0), now remains matrix identical to  m
>     >> > Im + 2:3 ## gave integer(0), now remains matrix identical to Im (integer)
>     >>
>     >> > m > 1      ## gave logical(0), now remains matrix identical to Lm (logical)
>     >> > m > 0.1[0] ##  ditto
>     >> > m > NULL   ##  ditto
>     >>
>     >> > ## 3. -------------------------
>     >> > mm <- m[,c(1:2,2:1,2)]
>     >> > try( m == mm ) ## now gives error   "non-conformable arrays",
>     >> > ## but gave logical(0) in R <= 3.3.x
>     >>
>     >> > ## 4. -------------------------
>     >> > str( Im + NULL)  ## gave "num", now gives "int"
>     >>
>     >> > ## 5. -------------------------
>     >> > ## special case for arithmetic w/ length-1 array
>     >> > (m1 <- matrix(1,1,1, dimnames=list("Ro","col")))
>     >> > (m2 <- matrix(1,2,1, dimnames=list(c("A","B"),"col")))
>     >>
>     >> > m1 + 1:2  # ->  2:3  but now with warning to  "become ERROR"
>     >> > tools::assertError(m1 & 1:2)# ERR: dims [product 1] do not match the length of object [2]
>     >> > tools::assertError(m1 < 1:2)# ERR:                  (ditto)
>     >> > ##
>     >> > ## non-0-length arrays combined with {NULL or double() or ...} *fail*
>     >>
>     >> > ### Length-1 arrays:  Arithmetic with |vectors| > 1  treated array as scalar
>     >> > m1 + NULL # gave  numeric(0) in R <= 3.3.x --- still, *but* w/ warning to "be ERROR"
>     >> > try(m1 > NULL)    # gave  logical(0) in R <= 3.3.x --- an *error* now in R >= 3.4.0
>     >> > tools::assertError(m1 & NULL)    # gave and gives error
>     >> > tools::assertError(m1 | double())# ditto
>     >> > ## m2 was slightly different:
>     >> > tools::assertError(m2 + NULL)
>     >> > tools::assertError(m2 & NULL)
>     >> > try(m2 == NULL) ## was logical(0) in R <= 3.3.x; now error as above!
>     >>
>     >> > ----------------------------------------------------------------------------
>     >>
>     >>
>     >> > Note that in R's own  'nls'  sources, there was one case of
>     >> > situation '2)' above, i.e. a  1x1-matrix was used as a "scalar".
>     >>
>     >> > In such cases, you should explicitly coerce it to a vector,
>     >> > either ("self-explainingly") by  as.vector(.), or as I did in
>     >> > the nls case  by  c(.) :  The latter is much less
>     >> > self-explaining, but nicer to read in mathematical formulae, and
>     >> > currently also more efficient because it is a .Primitive.
>     >>
>     >> > Please use R-devel with your code, and let us know if you see
>     >> > effects that seem adverse.
>     >>
>     >> I've been slightly surprised (or even "frustrated") by the empty
>     >> reaction on our R-devel list to this post.
>     >>
>     >> I would have expected some critique, maybe even some praise,
>     >> ... in any case some sign people are "thinking along" (as we say
>     >> in German).
>     >>
>     >> In the meantime, I've actually thought along the one case which
>     >> is last above:  The <op>  (binary operation) between a
>     >> non-0-length array and a 0-length vector (and NULL which should
>     >> be treated like a 0-length vector):
>     >>
>     >> R <= 3.3.1  *is* quite inconsistent with these:
>     >>
>     >>
>     >> and my proposal above (implemented in R-devel, since Sep.5) would
>     >> give an error for all these, but instead, R really could be more
>     >> lenient here:
>     >> A 0-length result is ok, and it should *not* inherit the array
>     >> (dim, dimnames), since the array is not of length 0. So instead
>     >> of the above [for the very last part only!!], we would aim for
>     >> the following. These *all* give an error in current R-devel,
>     >> with the exception of 'm1 + NULL' which "only" gives a "bad
>     >> warning" :
>     >>
>     >> ------------------------
>     >>
>     >> m1 <- matrix(1,1)
>     >> m2 <- matrix(1,2)
>     >>
>     >> m1 + NULL #    numeric(0) in R <= 3.3.x ---> OK ?!
>     >> m1 > NULL #    logical(0) in R <= 3.3.x ---> OK ?!
>     >> try(m1 & NULL)    # ERROR in R <= 3.3.x ---> change to logical(0) ?!
>     >> try(m1 | double())# ERROR in R <= 3.3.x ---> change to logical(0) ?!
>     >> ## m2 slightly different:
>     >> try(m2 + NULL)  # ERROR in R <= 3.3.x ---> change to double(0)  ?!
>     >> try(m2 & NULL)  # ERROR in R <= 3.3.x ---> change to logical(0)  ?!
>     >> m2 == NULL # logical(0) in R <= 3.3.x ---> OK ?!
>     >>
>     >> ------------------------
>     >>
>     >> This would be slightly more back-compatible than the currently
>     >> implemented proposal. Everything else I said remains true, and
>     >> I'm pretty sure most changes needed in packages would remain to
>     >> be done.
>     >>
>     >> Opinions ?
>     >>
>     >>
>     >>
>     >> > In some case where R-devel now gives an error but did not
>     >> > previously, we could contemplate giving another  "warning
>     >> > .... 'to become ERROR'" if there was too much breakage,  though
>     >> > I don't expect that.
>     >>
>     >>
>     >> > For the R Core Team,
>     >>
>     >> > Martin Maechler,
>     >> > ETH Zurich
>     >>
>     >> ______________________________________________
>     >> R-devel at r-project.org mailing list
>     >> https://stat.ethz.ch/mailman/listinfo/r-devel
>     >>
>
>
>
>     > --
>     > Robin Hankin
>     > Neutral theorist
>     > hankin.robin at gmail.com


-- 
Gabriel Becker, PhD
Associate Scientist (Bioinformatics)
Genentech Research


William Dunlap

2016-Sep-08 17:05 UTC

Shouldn't binary operators (arithmetic and logical) throw an error
when one operand is NULL (or another type that doesn't make sense)?  This is
a different case than a zero-length operand of a legitimate type.  E.g.,
     any(x < 0)
should return FALSE if x is number-like and length(x)==0 but give an error
if x is NULL.

I.e., I think the type check should be done before the length check.
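
As a rough illustration of "type check before length check", a user-level
wrapper could look like the sketch below. The name any_negative is
hypothetical and not part of R, and is.numeric() is only one possible reading
of "number-like"; this is not how the operators themselves are implemented.

```r
# Hypothetical helper (not part of R): check the type first, then the length.
any_negative <- function(x) {
  if (!is.numeric(x)) {
    stop("'x' must be numeric, not ", class(x)[1L])
  }
  any(x < 0)   # a zero-length numeric x gives any(logical(0)), i.e. FALSE
}

any_negative(numeric(0))   # FALSE: legitimate type, zero length
any_negative(c(1, -1))     # TRUE

# NULL is rejected by the type check before its length is ever considered:
r_null <- tryCatch(any_negative(NULL), error = function(e) conditionMessage(e))
r_null
```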


Bill Dunlap
TIBCO Software
wdunlap tibco.com


Gabriel Becker

2016-Sep-08 17:22 UTC

On Thu, Sep 8, 2016 at 10:05 AM, William Dunlap <wdunlap at tibco.com> wrote:
> Shouldn't binary operators (arithmetic and logical) throw an error
> when one operand is NULL (or another type that doesn't make sense)?  This is
> a different case than a zero-length operand of a legitimate type.  E.g.,
>      any(x < 0)
> should return FALSE if x is number-like and length(x)==0 but give an error
> if x is NULL.
>

Bill,

That is a good point. I can see the argument for this in the case that the
non-zero length is 1. I'm not sure which is better though. If we switch
any() to all(), things get murky.

Mathematically, all(x<0) is TRUE if x is length 0 (as are all(x==0), and
all(x>0)), but the likelihood of this being a thought-bug on the author's
part is exceedingly high, imho. So the desirable behavior seems to depend
on the angle we look at it from.

My personal opinion is that x < y with length(x)==0 should fail if
length(y) > 1, at least, and I'd be for it being an error even if y is
length 1, though I do acknowledge this is more likely (though still quite
unlikely imho) to be the intended behavior.
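
The any()/all() asymmetry on an empty vector can be seen directly in the
sketch below. The guard strict_all is a hypothetical illustration of the
"fail instead" preference, not an existing R function.

```r
x <- numeric(0)

# Every universally quantified test over an empty vector is vacuously TRUE ...
empty_all <- c(all(x < 0), all(x == 0), all(x > 0))
empty_all            # TRUE TRUE TRUE

# ... while the existential counterpart is FALSE.
empty_any <- any(x < 0)
empty_any            # FALSE

# Hypothetical guard (not part of R): refuse an empty condition.
strict_all <- function(cond) {
  if (length(cond) == 0L) stop("empty condition")
  all(cond)
}
guarded <- tryCatch(strict_all(x < 0), error = function(e) conditionMessage(e))
guarded              # "empty condition"
```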

~G
>
> I.e., I think the type check should be done before the length check.
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Thu, Sep 8, 2016 at 8:43 AM, Gabriel Becker <gmbecker at
ucdavis.edu>
> wrote:
>
>> Martin,
>>
>> Like Robin and Oliver I think this type of edge-case consistency is
>> important and that it's fantastic that R-core - and you personally
- are
>> willing to tackle some of these "gotcha" behaviors.
"Little" stuff like
>> this really does combine to go a long way to making R better and
better.
>>
>> I do wonder a  bit about the
>>
>> x = 1:2
>>
>> y = NULL
>>
>> x < y
>>
>> case.
>>
>> Returning a logical of length 0 is more backwards compatible, but is it
>> ever what the author actually intended? I have trouble thinking of a
case
>> where that less-than didn't carry an implicit assumption that y was
>> non-NULL.  I can say that in my own code, I've never hit that
behavior in
>> a
>> case that wasn't an error.
>>
>> My vote (unless someone else points out a compelling use for the
behavior)
>> is for the to throw an error. As a developer, I'd rather things
like this
>> break so the bug in my logic is visible, rather than  propagating as
the
>> 0-length logical is &'ed or |'ed with other logical
vectors, or used to
>> subset, or (in the case it should be length 1) passed to if() (if
throws
>> an
>> error now, but the rest would silently "work").
>>
>> Best,
>> ~G
>>
>> On Thu, Sep 8, 2016 at 3:49 AM, Martin Maechler <
>> maechler at stat.math.ethz.ch>
>> wrote:
>>
>> > >>>>> robin hankin <hankin.robin at
gmail.com>
>> > >>>>>     on Thu, 8 Sep 2016 10:05:21 +1200 writes:
>> >
>> >     > Martin I'd like to make a comment; I think that
R's
>> >     > behaviour on 'edge' cases like this is an
important thing
>> >     > and it's great that you are working on it.
>> >
>> >     > I make heavy use of zero-extent arrays, chiefly because
>> >     > the dimnames are an efficient and logical way to keep
>> >     > track of certain types of information.
>> >
>> >     > If I have, for example,
>> >
>> >     > a <- array(0,c(2,0,2))
>> >     > dimnames(a) <-
list(name=c('Mike','Kevin'),
>> > NULL,item=c("hat","scarf"))
>> >
>> >
>> >     > Then in R-3.3.1, 70800 I get
>> >
>> >     a> 0
>> >     > logical(0)
>> >     >>
>> >
>> >     > But in 71219 I get
>> >
>> >     a> 0
>> >     > , , item = hat
>> >
>> >
>> >     > name
>> >     > Mike
>> >     > Kevin
>> >
>> >     > , , item = scarf
>> >
>> >
>> >     > name
>> >     > Mike
>> >     > Kevin
>> >
>> >     > (which is an empty logical array that holds the names of
the
>> people
>> > and
>> >     > their clothes). I find the behaviour of 71219 very much
preferable
>> > because
>> >     > there is no reason to discard the information in the
dimnames.
>> >
>> > Thanks a lot, Robin, (and Oliver) !
>> >
>> > Yes, the above is such a case where the new behavior makes much
sense.
>> > And this behavior remains identical after the 71222 amendment.
>> >
>> > Martin
>> >
>> >     > Best wishes
>> >     > Robin
>> >
>> >
>> >
>> >
>> >     > On Wed, Sep 7, 2016 at 9:49 PM, Martin Maechler <
>> > maechler at stat.math.ethz.ch>
>> >     > wrote:
>> >
>> >     >> >>>>> Martin Maechler <maechler at
stat.math.ethz.ch>
>> >     >> >>>>>     on Tue, 6 Sep 2016 22:26:31
+0200 writes:
>> >     >>
>> >     >> > Yesterday, changes to R's development
version were committed,
>> >     >> relating
>> >     >> > to arithmetic, logic ('&' and
'|') and
>> >     >> > comparison/relational ('<',
'==') binary operators
>> >     >> > which in NEWS are described as
>> >     >>
>> >     >> > SIGNIFICANT USER-VISIBLE CHANGES:
>> >     >>
>> >     >> > [.............]
>> >     >>
>> >     >> > ? Arithmetic, logic (?&?, ?|?) and
comparison (aka
>> >     >> > ?relational?, e.g., ?<?, ?==?) operations
with arrays now
>> >     >> > behave consistently, notably for arrays of
length zero.
>> >     >>
>> >     >> > Arithmetic between length-1 arrays and longer
non-arrays had
>> >     >> > silently dropped the array attributes and
recycled.  This
>> >     >> > now gives a warning and will signal an error in
the future,
>> >     >> > as it has always for logic and comparison
operations in
>> >     >> > these cases (e.g., compare ?matrix(1,1) + 2:3?
and
>> >     >> > ?matrix(1,1) < 2:3?).
>> >     >>
>> >     >> > As the above "visually suggests" one
could think of the changes
>> >     >> > falling mainly two groups,
>> >     >> > 1) <0-extent array>  (op)    
<non-array>
>> >     >> > 2) <1-extent array>  (arith) 
<non-array of length != 1>
>> >     >>
>> >     >> > These changes are partly non-back compatible and
may break
>> >     >> > existing code.  We believe that the internal
consistency gained
>> >     >> > from the changes is worth the few places with
problems.
>> >     >>
>> >     >> > We expect some package maintainers (10-20, or
even more?) need
>> >     >> > to adapt their code.
>> >     >>
>> >     >> > Case '2)' above mainly results in a new
warning, e.g.,
>> >     >>
>> >     >> >> matrix(1,1) + 1:2
>> >     >> > [1] 2 3
>> >     >> > Warning message:
>> >     >> > In matrix(1, 1) + 1:2 :
>> >     >> > dropping dim() of array of length one.  Will
become ERROR
>> >     >> >>
>> >     >>
>> >     >> > whereas '1)' gives errors in cases the
result silently was a
>> >     >> > vector of length zero, or also keeps array (dim
& dimnames) in
>> >     >> > cases these were silently dropped.
>> >     >>
>> >     >> > The following is a "heavily" commented
R script showing (all
>> ?)
>> >     >> > the important cases with changes :
>> >     >>
>> >     >> >
------------------------------------------------------------
>> >     >> ----------------
>> >     >>
>> >     >> > (m <- cbind(a=1[0], b=2[0]))
>> >     >> > Lm <- m; storage.mode(Lm) <-
"logical"
>> >     >> > Im <- m; storage.mode(Im) <-
"integer"
>> >     >>
>> >     >> > ## 1. -------------------------
>> >     >> > try( m & NULL ) # in R <= 3.3.x :
>> >     >> > ## Error in m & NULL :
>> >     >> > ##  operations are possible only for numeric,
logical or
>> complex
>> >     >> types
>> >     >> > ##
>> >     >> > ## gives 'Lm' in R >= 3.4.0
>> >     >>
>> >     >> > ## 2. -------------------------
>> >     >> > m + 2:3 ## gave numeric(0), now remains matrix
identical to  m
>> >     >> > Im + 2:3 ## gave integer(0), now remains matrix
identical to Im
>> >     >> (integer)
>> >     >>
>> >     >> > m > 1      ## gave logical(0), now remains
matrix identical to
>> Lm
>> >     >> (logical)
>> >     >> > m > 0.1[0] ##  ditto
>> >     >> > m > NULL   ##  ditto
>> >     >>
>> >     >> > ## 3. -------------------------
>> >     >> > mm <- m[,c(1:2,2:1,2)]
>> >     >> > try( m == mm ) ## now gives error "non-conformable arrays",
>> >     >> > ## but gave logical(0) in R <= 3.3.x
>> >     >>
>> >     >> > ## 4. -------------------------
>> >     >> > str( Im + NULL)  ## gave "num", now gives "int"
>> >     >>
>> >     >> > ## 5. -------------------------
>> >     >> > ## special case for arithmetic w/ length-1 array
>> >     >> > (m1 <- matrix(1,1,1, dimnames=list("Ro","col")))
>> >     >> > (m2 <- matrix(1,2,1, dimnames=list(c("A","B"),"col")))
>> >     >>
>> >     >> > m1 + 1:2  # ->  2:3  but now with warning to "become ERROR"
>> >     >> > tools::assertError(m1 & 1:2)# ERR: dims [product 1] do not match the length of object [2]
>> >     >> > tools::assertError(m1 < 1:2)# ERR:            (ditto)
>> >     >> > ##
>> >     >> > ## non-0-length arrays combined with {NULL or double() or ...} *fail*
>> >     >>
>> >     >> > ### Length-1 arrays:  Arithmetic with |vectors| > 1  treated array as scalar
>> >     >> > m1 + NULL # gave  numeric(0) in R <= 3.3.x --- still, *but* w/ warning to "be ERROR"
>> >     >> > try(m1 > NULL)    # gave  logical(0) in R <= 3.3.x --- an *error* now in R >= 3.4.0
>> >     >> > tools::assertError(m1 & NULL)    # gave and gives error
>> >     >> > tools::assertError(m1 | double())# ditto
>> >     >> > ## m2 was slightly different:
>> >     >> > tools::assertError(m2 + NULL)
>> >     >> > tools::assertError(m2 & NULL)
>> >     >> > try(m2 == NULL) ## was logical(0) in R <= 3.3.x; now error as above!
>> >     >>
>> >     >> > ----------------------------------------------------------------------------
>> >     >>
>> >     >>
>> >     >> > Note that in R's own 'nls' sources, there was one case of
>> >     >> > situation '2)' above, i.e. a 1x1-matrix was used as a "scalar".
>> >     >>
>> >     >> > In such cases, you should explicitly coerce it to a vector,
>> >     >> > either ("self-explainingly") by as.vector(.), or as I did in
>> >     >> > the nls case by c(.) :  The latter is much less
>> >     >> > self-explaining, but nicer to read in mathematical formulae, and
>> >     >> > currently also more efficient because it is a .Primitive.
>> >     >>
>> >     >> > Please use R-devel with your code, and let us know if you see
>> >     >> > effects that seem adverse.
>> >     >>
>> >     >> I've been slightly surprised (or even "frustrated") by the empty
>> >     >> reaction on our R-devel list to this post.
>> >     >>
>> >     >> I would have expected some critique, may be even some praise,
>> >     >> ... in any case some sign people are "thinking along" (as we say
>> >     >> in German).
>> >     >>
>> >     >> In the mean time, I've actually thought along the one case which
>> >     >> is last above:  The <op>  (binary operation) between a
>> >     >> non-0-length array and a 0-length vector (and NULL which should
>> >     >> be treated like a 0-length vector):
>> >     >>
>> >     >> R <= 3.3.1  *is* quite inconsistent with these:
>> >     >>
>> >     >>
>> >     >> and my proposal above (implemented in R-devel, since Sep.5) would
>> >     >> give an error for all these, but instead, R really could be more
>> >     >> lenient here:
>> >     >> A 0-length result is ok, and it should *not* inherit the array
>> >     >> (dim, dimnames), since the array is not of length 0. So instead
>> >     >> of the above [for the very last part only!!], we would aim for
>> >     >> the following. These *all* give an error in current R-devel,
>> >     >> with the exception of 'm1 + NULL' which "only" gives a "bad
>> >     >> warning" :
>> >     >>
>> >     >> ------------------------
>> >     >>
>> >     >> m1 <- matrix(1,1)
>> >     >> m2 <- matrix(1,2)
>> >     >>
>> >     >> m1 + NULL #    numeric(0) in R <= 3.3.x ---> OK ?!
>> >     >> m1 > NULL #    logical(0) in R <= 3.3.x ---> OK ?!
>> >     >> try(m1 & NULL)    # ERROR in R <= 3.3.x ---> change to logical(0) ?!
>> >     >> try(m1 | double())# ERROR in R <= 3.3.x ---> change to logical(0) ?!
>> >     >> ## m2 slightly different:
>> >     >> try(m2 + NULL)  # ERROR in R <= 3.3.x ---> change to double(0) ?!
>> >     >> try(m2 & NULL)  # ERROR in R <= 3.3.x ---> change to logical(0) ?!
>> >     >> m2 == NULL # logical(0) in R <= 3.3.x ---> OK ?!
>> >     >>
>> >     >> ------------------------
>> >     >>
>> >     >> This would be slightly more back-compatible than the currently
>> >     >> implemented proposal. Everything else I said remains true, and
>> >     >> I'm pretty sure most changes needed in packages would remain to
>> >     >> be done.
>> >     >>
>> >     >> Opinions ?
>> >     >>
>> >     >>
>> >     >>
>> >     >> > In some case where R-devel now gives an error but did not
>> >     >> > previously, we could contemplate giving another "warning
>> >     >> > .... 'to become ERROR'" if there was too much breakage, though
>> >     >> > I don't expect that.
>> >     >>
>> >     >>
>> >     >> > For the R Core Team,
>> >     >>
>> >     >> > Martin Maechler,
>> >     >> > ETH Zurich
>> >     >>
>> >     >> ______________________________________________
>> >     >> R-devel at r-project.org mailing list
>> >     >> https://stat.ethz.ch/mailman/listinfo/r-devel
>> >     >>
>> >
>> >
>> >
>> >     > --
>> >     > Robin Hankin
>> >     > Neutral theorist
>> >     > hankin.robin at gmail.com
>> >
>> >     > [[alternative HTML version deleted]]
>> >
>> > ______________________________________________
>> > R-devel at r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-devel
>> >
>>
>>
>>
>> --
>> Gabriel Becker, PhD
>> Associate Scientist (Bioinformatics)
>> Genentech Research
>>
>>         [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>
>

-- 
Gabriel Becker, PhD
Associate Scientist (Bioinformatics)
Genentech Research

	[[alternative HTML version deleted]]
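
For readers adapting package code: the advice quoted above, to coerce a 1x1 matrix explicitly when it is meant as a "scalar", can be sketched as follows. This is only an illustration, not code from the thread or from the nls sources; the helper name as_scalar() is made up for this example.

```r
# Illustration only: as_scalar() is a hypothetical helper, not part of R.
m1 <- matrix(1, 1, 1, dimnames = list("Ro", "col"))

# Both coercions drop dim and dimnames, leaving a plain length-1 vector.
v_c  <- c(m1)
v_as <- as.vector(m1)

# With the array attributes gone, ordinary recycling applies and the
# "dropping dim() of array of length one" warning is not triggered.
sum_c <- v_c + 1:2

# Hypothetical guard for code that really expects a single value:
# fail early instead of recycling something unexpected.
as_scalar <- function(x) {
  stopifnot(length(x) == 1L)
  as.vector(x)
}

print(sum_c)
print(as_scalar(m1) + 1:2)
```

Whether c(.) or as.vector(.) reads better is a matter of taste, as the quoted message says; the guard is optional and only useful where a longer input would indicate a bug.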
