Dénes Tóth
2018-Aug-30 12:09 UTC
[Rd] ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1
On 08/30/2018 01:56 PM, Joris Meys wrote:
> I have to agree with Emil here. && and || are short circuited like in C and
> C++. That means that
>
> TRUE || c(TRUE, FALSE)
> FALSE && c(TRUE, FALSE)
>
> cannot give an error because the second part is never evaluated. Throwing a
> warning or error for
>
> c(TRUE, FALSE) || TRUE
>
> would mean that the operator gives a different result depending on the
> order of the objects, breaking the symmetry. Also that would be undesirable.

Note that `||` and `&&` have never been symmetric:

TRUE || stop() # returns TRUE
stop() || TRUE # returns an error

> Regarding logical(0): per the documentation, it is indeed so that ||, &&
> and isTRUE always return a length-one logical vector. Hence the NA.
>
> On a sidenote: there is no such thing as a scalar in R. What you call
> scalar, is really a length-one vector. That seems like a detail, but is
> important in understanding why this admittedly confusing behaviour actually
> makes sense within the framework of R imho. I do understand your objections
> and suggestions, but it would boil down to removing short circuited
> operators from R.
>
> My 2 cents.
> Cheers
> Joris
>
> On Wed, Aug 29, 2018 at 5:03 AM Henrik Bengtsson <henrik.bengtsson at gmail.com>
> wrote:
>
>> # Issue
>>
>> 'x || y' performs 'x[1] || y' for length(x) > 1. For instance (here
>> using R 3.5.1),
>>
>> > c(TRUE, TRUE) || FALSE
>> [1] TRUE
>> > c(TRUE, FALSE) || FALSE
>> [1] TRUE
>> > c(TRUE, NA) || FALSE
>> [1] TRUE
>> > c(FALSE, TRUE) || FALSE
>> [1] FALSE
>>
>> This property is symmetric in LHS and RHS (i.e. 'y || x' behaves the
>> same) and it also applies to 'x && y'.
>>
>> Note also how the above truncation of 'x' is completely silent -
>> there's neither an error nor a warning being produced.
>>
>>
>> # Discussion/Suggestion
>>
>> Using 'x || y' and 'x && y' with a non-scalar 'x' or 'y' is likely a
>> mistake. Either the code is written assuming 'x' and 'y' are scalars,
>> or there is a coding error and vectorized versions 'x | y' and 'x & y'
>> were intended. Should 'x || y' always be considered a mistake if
>> 'length(x) != 1' or 'length(y) != 1'? If so, should it be a warning
>> or an error? For instance,
>>
>> > x <- c(TRUE, TRUE)
>> > y <- FALSE
>> > x || y
>>
>> Error in x || y : applying scalar operator || to non-scalar elements
>> Execution halted
>>
>> What about the case where 'length(x) == 0' or 'length(y) == 0'? Today
>> 'x || y' returns 'NA' in such cases, e.g.
>>
>> > logical(0) || c(FALSE, NA)
>> [1] NA
>> > logical(0) || logical(0)
>> [1] NA
>> > logical(0) && logical(0)
>> [1] NA
>>
>> I don't know the background for this behavior, but I'm sure there is
>> an argument behind that one. Maybe it's simply that '||' and '&&'
>> should always return a scalar logical and neither TRUE nor FALSE can
>> be returned.
>>
>> /Henrik
>>
>> PS. This is in the same vein as
>> https://mailman.stat.ethz.ch/pipermail/r-devel/2017-March/073817.html
>> - in R (>=3.4.0) we now get that if (1:2 == 1) ... is an error if
>> _R_CHECK_LENGTH_1_CONDITION_=true
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
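To make the two points in this message concrete (the vectorized alternatives Henrik mentions and the short-circuiting Dénes points out), here is a minimal sketch. The variable names are made up for illustration, and the example deliberately avoids passing a vector of length greater than one to `||`, since how that case is handled is exactly what the thread is debating.

```r
x <- c(TRUE, FALSE)
y <- FALSE

# Element-wise OR: one result per element of 'x' ('y' is recycled).
elementwise <- x | y

# Reduce 'x' to a single value explicitly, then use the scalar operator.
any_or <- any(x) || y
all_or <- all(x) || y

# Short-circuiting: the right-hand side is not evaluated when the
# left-hand side already decides the result, so stop() is never reached.
lhs_first <- TRUE || stop("not reached")

# With the operands swapped, stop() is evaluated first and signals an error.
rhs_first <- tryCatch(stop("reached") || TRUE,
                      error = function(e) "error")

print(elementwise)  # TRUE FALSE
print(any_or)       # TRUE
print(all_or)       # FALSE
print(lhs_first)    # TRUE
print(rhs_first)    # "error"
```

Writing `any(x)` or `all(x)` states which reduction is intended, instead of leaving it to how `||` treats a longer operand.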
Joris Meys
2018-Aug-30 12:48 UTC
[Rd] ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1
On Thu, Aug 30, 2018 at 2:09 PM Dénes Tóth <toth.denes at kogentum.hu> wrote:

> Note that `||` and `&&` have never been symmetric:
>
> TRUE || stop() # returns TRUE
> stop() || TRUE # returns an error

Fair point. So the suggestion would be to check whether x is of length 1 and
whether y is of length 1 only when needed. I.e.

c(TRUE,FALSE) || TRUE

would give an error and

TRUE || c(TRUE, FALSE)

would pass.

Thought about it a bit more, and I can't come up with a use case where the
first line must pass. So if the short circuiting remains and the extra check
only gives a small performance penalty, adding the error could indeed make
some bugs more obvious.

Cheers
Joris

--
Joris Meys
Statistical consultant

Department of Data Analysis and Mathematical Modelling
Ghent University
Coupure Links 653, B-9000 Gent (Belgium)

-----------
Biowiskundedagen 2017-2018
http://www.biowiskundedagen.ugent.be/

-------------------------------
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
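The "check only when needed" idea can be tried out at user level. What follows is only a sketch under stated assumptions: `strict_or()` is a hypothetical helper, not part of base R and not the change discussed for the C sources. It relies on R's lazy argument evaluation, so `y` is inspected only if `x` does not already decide the result.

```r
# Hypothetical helper (not base R): a user-level sketch of a length-checked,
# still short-circuiting version of `||`.
strict_or <- function(x, y) {
  if (length(x) != 1L) {
    stop("strict_or: 'x' must have length 1, got length ", length(x))
  }
  # 'y' is a promise; returning here means it is never evaluated.
  if (isTRUE(x)) {
    return(TRUE)
  }
  # Only now is 'y' forced, so only now can its length be checked.
  if (length(y) != 1L) {
    stop("strict_or: 'y' must have length 1, got length ", length(y))
  }
  x || y
}

# The right-hand side is never looked at, so this passes.
print(strict_or(TRUE, c(TRUE, FALSE)))  # TRUE

# A left-hand side of length 2 is rejected.
print(tryCatch(strict_or(c(TRUE, FALSE), TRUE),
               error = function(e) "error"))  # "error"
```

This sketch also rejects zero-length operands, which goes further than what is discussed in the thread; relaxing the two tests to `length(...) > 1L` would leave the zero-length case to `||` itself.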
Emil Bode
2018-Aug-30 14:01 UTC
[Rd] ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1
Okay, I thought you always wanted to check the length, but if we can only
check what's evaluated I mostly agree.

I still think there's not much wrong with how length-0 logicals are treated,
as the return of NA in cases where the value matters is enough warning I
think, and I can imagine some code like my previous example
'x==-1 || length(x)==0', which wouldn't need a warning.
But we could do a check for length being >1.

Greetings, Emil

On 30/08/2018, 14:55, "R-devel on behalf of Joris Meys" <r-devel-bounces at r-project.org on behalf of jorismeys at gmail.com> wrote:

> On Thu, Aug 30, 2018 at 2:09 PM Dénes Tóth <toth.denes at kogentum.hu> wrote:
>
>> Note that `||` and `&&` have never been symmetric:
>>
>> TRUE || stop() # returns TRUE
>> stop() || TRUE # returns an error
>
> Fair point. So the suggestion would be to check whether x is of length 1 and
> whether y is of length 1 only when needed. I.e.
>
> c(TRUE,FALSE) || TRUE
>
> would give an error and
>
> TRUE || c(TRUE, FALSE)
>
> would pass.
>
> Thought about it a bit more, and I can't come up with a use case where the
> first line must pass. So if the short circuiting remains and the extra check
> only gives a small performance penalty, adding the error could indeed make
> some bugs more obvious.
>
> Cheers
> Joris
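For code in the spirit of 'x==-1 || length(x)==0', one option that does not depend on how zero-length operands are treated is to put the length test first, so that the comparison is skipped for an empty vector. This is only a sketch: the function name and the sentinel value -1 are assumptions taken from the example above, not from any package.

```r
# Hypothetical predicate, for illustration only: TRUE when 'x' is empty
# or equals the sentinel value -1.
is_empty_or_sentinel <- function(x) {
  # The length test comes first, so for an empty 'x' the comparison
  # 'x == -1' (which would be logical(0)) is never evaluated.
  length(x) == 0L || x == -1
}

print(is_empty_or_sentinel(numeric(0)))  # TRUE
print(is_empty_or_sentinel(-1))          # TRUE
print(is_empty_or_sentinel(3))           # FALSE
```

The function still assumes that a non-empty `x` has length one; longer inputs fall under the length > 1 question discussed in this thread.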
Martin Maechler
2018-Aug-30 15:58 UTC
[Rd] ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1
>>>>> Joris Meys
>>>>>     on Thu, 30 Aug 2018 14:48:01 +0200 writes:

    > On Thu, Aug 30, 2018 at 2:09 PM Dénes Tóth
    > <toth.denes at kogentum.hu> wrote:
    >> Note that `||` and `&&` have never been symmetric:
    >>
    >> TRUE || stop() # returns TRUE
    >> stop() || TRUE # returns an error

    > Fair point. So the suggestion would be to check whether x
    > is of length 1 and whether y is of length 1 only when
    > needed. I.e.

    > c(TRUE,FALSE) || TRUE

    > would give an error and

    > TRUE || c(TRUE, FALSE)

    > would pass.

    > Thought about it a bit more, and I can't come up with a
    > use case where the first line must pass. So if the short
    > circuiting remains and the extra check only gives a small
    > performance penalty, adding the error could indeed make
    > some bugs more obvious.

I agree "in theory".
Thank you, Henrik, for bringing it up!

In practice I think we should start having a warning signalled.
I have checked the source code in the mean time, and the check is
really very cheap
{ because it can/should be done after checking isNumber(): so
  then we know we have an atomic and can use XLENGTH() }

The 0-length case I don't think we should change as I do find
NA (is logical!) to be an appropriate logical answer.

Martin Maechler
ETH Zurich and R Core team.
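A warning-first variant along these lines can also be prototyped in plain R. This is a sketch under stated assumptions, not the C-level change described above: `warn_or()` and `first_with_warning()` are hypothetical names, and the wording of the warning is invented. On a longer operand the sketch warns and falls back to the first element, which is the R 3.5.1 behaviour described at the start of the thread, and it leaves zero-length operands to `||` itself.

```r
# Hypothetical helper (not base R): warn when 'v' has more than one element
# and fall back to its first element.
first_with_warning <- function(v, name) {
  if (length(v) > 1L) {
    warning(sprintf("'%s' has length %d; only the first element is used",
                    name, length(v)), call. = FALSE)
    v <- v[1L]
  }
  v
}

# Hypothetical warning-signalling version of `||`.
warn_or <- function(x, y) {
  x <- first_with_warning(x, "x")
  # Short-circuit: 'y' is never evaluated when 'x' is TRUE.
  if (isTRUE(x)) {
    return(TRUE)
  }
  y <- first_with_warning(y, "y")
  x || y
}

# Length-one operands behave as usual, without a warning.
print(warn_or(FALSE, TRUE))  # TRUE

# A longer left-hand side triggers the warning, caught here for display.
print(tryCatch(warn_or(c(FALSE, TRUE), FALSE),
               warning = function(w) "warned"))  # "warned"
```

Callers who want the stricter behaviour while testing can set `options(warn = 2)`, which turns warnings into errors.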