Dear R-help,

I was bitten by the behavior of all() when given logical(0): it is TRUE! (And any(logical(0)) is FALSE.) Wouldn't it be better to return logical(0) in both cases?

The problem surfaced because some un-named individual called randomForest(x, y, xtest, ytest, ...) and gave y as a two-level factor but ytest as just a numeric vector. I thought I was checking for that in my code by testing

    if (!all(levels(y) == levels(ytest))) stop(...)

but levels() on a non-factor returns NULL, and the comparison ended up being logical(0). Since all(logical(0)) is TRUE, the error is not flagged.

Best,
Andy

Andy Liaw, PhD
Biometrics Research, PO Box 2000, RY33-300
Merck Research Labs, Rahway, NJ 07065
mailto:andy_liaw at merck.com 732-594-0820

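The failure described in this message can be reproduced in a few lines. This is only a sketch: the values of `y` and `ytest` below are made up for illustration and are not taken from the original randomForest call.

```r
# Hypothetical inputs reproducing the situation described above.
y     <- factor(c("a", "b", "a"))    # two-level factor
ytest <- c(1, 2, 1)                  # plain numeric vector, not a factor

levels(ytest)                        # NULL: a non-factor has no levels
cmp <- levels(y) == levels(ytest)    # comparing against NULL gives logical(0)
length(cmp)                          # 0
all(cmp)                             # TRUE, so !all(cmp) is FALSE and stop() is never reached
```

Because the comparison has length zero, the guard `if (!all(...)) stop(...)` passes silently instead of flagging the mismatch.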
Andy Liaw wrote:

> I was bitten by the behavior of all() when given logical(0): It is
> TRUE! (And any(logical(0)) is FALSE.) Wouldn't it be better to
> return logical(0) in both cases?

It seems to me that what R does is, strictly speaking, correct. Any universal claim about the members of the empty set is true. If a set of logical entities is empty, then it is correct to say that all members of that set are TRUE. (Because you cannot find a counter-example --- you cannot find a member of that set which isn't TRUE.) Likewise you can't find a member of that set which ***is*** TRUE, so the answer to the question ``Are any of these TRUE?'' is ``No.'', i.e. ``any(logical(0))'' is FALSE.

So returning logical(0) in these cases is not strictly correct; whether it would do any harm is not clear to me --- i.e. I can't think of an example where it would cause harm. And of course it would guard against the Trap For Young Players that Andy Liaw described. However, since it is not strictly correct, great caution ought to be used.

cheers,

Rolf Turner
rolf at math.unb.ca

I wrote:

> I was bitten by the behavior of all() when given logical(0):
> It is TRUE!
> (And any(logical(0)) is FALSE.) Wouldn't it be better to
> return logical(0)
> in both cases?

I guess the behavior is consistent with:

    > prod(numeric(0))
    [1] 1
    > sum(numeric(0))
    [1] 0

but why?

Andy

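One common way to answer the "but why?" is that each reducing function returns the identity element of its operation on empty input, so that splitting a vector into pieces never changes the result. The sketch below illustrates this with arbitrary example values; it is an illustration of that reasoning, not a statement taken from the thread.

```r
x     <- c(2, 3, 4)    # arbitrary example values
empty <- numeric(0)

sum(empty)     # 0, the identity for +
prod(empty)    # 1, the identity for *

# Splitting a vector into pieces must not change the total,
# even when one of the pieces is empty.
sum(c(x, empty))  == sum(x)  + sum(empty)    # TRUE
prod(c(x, empty)) == prod(x) * prod(empty)   # TRUE

# all() and any() follow the same pattern with & and |.
all(logical(0))   # TRUE,  the identity for &
any(logical(0))   # FALSE, the identity for |
```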
On Thu, 15 Apr 2004, Liaw, Andy wrote:

> Dear R-help,
>
> I was bitten by the behavior of all() when given logical(0): It is TRUE!
> (And any(logical(0)) is FALSE.) Wouldn't it be better to return logical(0)
> in both cases?

No, it wouldn't. The convention that "For all x in A: P(x)" is true whenever A is empty and that "There exists x in A: P(x)" is false is very old.

It is also useful that all(x) && all(y) is necessarily the same as all(x,y), which wouldn't be true under your suggestion.

The basic principle is that elementwise operations give zero-length vectors for zero-length operands, but reducing operations that collapse a vector to a scalar give a suitable scalar.

	-thomas

> The problem surfaced because some un-named individual called randomForest(x,
> y, xtest, ytest,...), and gave y as a two-level factor, but ytest as just
> numeric vector. I thought I check for that in my code by testing for
>
> if (!all(levels(y) == levels(ytest))) stop(...)
>
> but levels() on a non-factor returns NULL, and the comparison ended up being
> logical(0). Since all(logical(0)) is TRUE, the error is not flagged.

Thomas Lumley
Assoc. Professor, Biostatistics
tlumley at u.washington.edu
University of Washington, Seattle

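The two points in this reply, the concatenation property and the split between elementwise and reducing operations, can be checked directly. A minimal sketch with made-up vectors:

```r
x <- c(TRUE, TRUE)
y <- logical(0)

# Reducing over the pieces agrees with reducing over everything at once.
all(x) && all(y)    # TRUE
all(x, y)           # TRUE

x0 <- numeric(0)

# Elementwise operations: zero-length in, zero-length out.
x0 + 1              # numeric(0)
x0 > 0              # logical(0)

# Reducing operations: always a single value, even for empty input.
sum(x0)             # 0
all(x0 > 0)         # TRUE
any(x0 > 0)         # FALSE
```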
On Thu, 15 Apr 2004 10:40:41 -0400, "Liaw, Andy" <andy_liaw at merck.com> wrote:

>Dear R-help,
>
>I was bitten by the behavior of all() when given logical(0): It is TRUE!
>(And any(logical(0)) is FALSE.) Wouldn't it be better to return logical(0)
>in both cases?

As Rolf said, this behaviour makes sense.

>The problem surfaced because some un-named individual called randomForest(x,
>y, xtest, ytest,...), and gave y as a two-level factor, but ytest as just
>numeric vector. I thought I check for that in my code by testing for
>
>if (!all(levels(y) == levels(ytest))) stop(...)

You should fix this by having

    y <- as.factor(y)
    ytest <- as.factor(ytest)

somewhere earlier in your code, since your code assumes that they are both factors but doesn't enforce that in its input checks.

Duncan Murdoch

Thanks to Rolf, Thomas, Duncan & Doug for the explanations! It's one of those things that I should have remembered from high school but clearly didn't...

I've changed my code to:

    [If y is factor:]
    if (!is.null(ytest)) {
        if (!is.factor(ytest)) stop("ytest must be a factor")
        if (!all(levels(y) == levels(ytest)))
            stop("y and ytest must have the same levels")
    }

For Duncan: `y' can be either factor or numeric. That's how randomForest determines whether it's a classification or regression problem.

Thanks again to all!
Andy

> From: Liaw, Andy
>
> Dear R-help,
>
> I was bitten by the behavior of all() when given logical(0):
> It is TRUE!
> (And any(logical(0)) is FALSE.) Wouldn't it be better to
> return logical(0)
> in both cases?
>
> The problem surfaced because some un-named individual called
> randomForest(x,
> y, xtest, ytest,...), and gave y as a two-level factor, but
> ytest as just
> numeric vector. I thought I check for that in my code by testing for
>
> if (!all(levels(y) == levels(ytest))) stop(...)
>
> but levels() on a non-factor returns NULL, and the comparison
> ended up being
> logical(0). Since all(logical(0)) is TRUE, the error is not flagged.

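The revised check can be wrapped in a small helper so it is easy to exercise. This is a sketch only: `check_ytest` is a hypothetical name, not part of randomForest, and it swaps `all(levels(y) == levels(ytest))` for `identical()`, an assumption on my part that avoids length-zero results and recycling when the two level sets differ in length.

```r
# Hypothetical helper (not the randomForest source): validate ytest against y.
check_ytest <- function(y, ytest = NULL) {
  # Nothing to check for regression (numeric y) or when no test response is given.
  if (is.null(ytest) || !is.factor(y)) return(invisible(TRUE))
  if (!is.factor(ytest)) stop("ytest must be a factor")
  # identical() always returns a single TRUE or FALSE, never logical(0).
  if (!identical(levels(y), levels(ytest)))
    stop("y and ytest must have the same levels")
  invisible(TRUE)
}

y <- factor(c("a", "b", "a"))    # made-up example data

# The case from the original report: numeric ytest with factor y.
msg <- tryCatch(check_ytest(y, c(1, 2, 1)),
                error = function(e) conditionMessage(e))
msg                              # "ytest must be a factor"

# Matching factors pass silently.
ok <- check_ytest(y, factor(c("b", "a"), levels = c("a", "b")))
```

Unlike Duncan's as.factor() suggestion, this keeps numeric `y` valid, which matters because randomForest uses the type of `y` to choose between classification and regression.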
Hello,

I just downloaded RW1090. No problems. My thanks to everybody involved in the project. I work in Win98.

I updated my library and had some problems with files that were in the PACKAGES list but not on the 1.9/ site; now they all are.

I tried to install "Zelig" from Harvard:

    install.packages("Zelig", CRAN="http://gking.harvard.edu")

This worked in 1.8.1, but now it appends "bin/windows/contrib/1.9" to the address, and of course it cannot find the file and aborts with error 404. Is there a way around this besides http'ing directly to Harvard, getting the zip, and unzipping it in /library?

Thanks.

Heberto Ghezzo Ph.D.
McGill University
Montreal - Que - Canada

"Liaw, Andy" <andy_liaw at merck.com> wrote:

	I was bitten by the behavior of all() when given logical(0): It is TRUE!
	(And any(logical(0)) is FALSE.) Wouldn't it be better to return logical(0)
	in both cases?

It would be disastrous. For all integer n >= 0,

    all(integer(n) == integer(n)) => TRUE
    any(integer(n) != integer(n)) => FALSE

Your proposal would give wrong answers for n == 0.

For any simple array (who knows what an arbitrary object will do?) we expect

    all(x == x) => TRUE
    any(x != x) => FALSE

If this were changed for empty x, we'd never be able to trust any() or all() again.

Find a book about logic and read how bounded quantification

    (\forall x \in set) p(x)
    (\exists x \in set) p(x)

is supposed to work when the set is empty.
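
The invariant stated in this message can be checked for a few lengths, including the empty case. A minimal sketch; the range 0:3 is an arbitrary choice:

```r
# For each n, integer(n) is a vector of n zeros, so x == x holds elementwise.
ns <- 0:3
all_same <- sapply(ns, function(n) all(integer(n) == integer(n)))
any_diff <- sapply(ns, function(n) any(integer(n) != integer(n)))

all_same    # TRUE for every n, including n == 0
any_diff    # FALSE for every n, including n == 0
```

If all() returned logical(0) for empty input, the n == 0 entry would be the one case where the invariant fails.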