Marc Schwartz
2007-Jun-20 15:53 UTC
[Rd] Expected behavior from: all(c(NA, NA, NA) < NA, na.rm = TRUE)?
Hi all,

Came across this curious behavior in:

R version 2.5.0 Patched (2007-06-05 r41831)

A simplified example is:

> all(c(NA, NA, NA) > NA, na.rm = TRUE)
[1] TRUE

Is this expected by definition?

If one reduces this to individual comparisons, such as:

> NA > NA
[1] NA

> all(NA > NA)
[1] NA

> all(NA > NA, na.rm = TRUE)
[1] TRUE

the initial comparison on the 3 element vector would be consistent with the last example.

If one evaluates each side of the comparison within the parens in the initial example, you get something along the lines of the following:

x <- c(NA, NA, NA)
x <- x[!is.na(x)]  # remove NA's (eg. mean.default(x, na.rm = TRUE))

> x
logical(0)

> logical(0) > NA
logical(0)

> all(logical(0))
[1] TRUE

If my train of thought is correct, it seems to me that the behavior above distills down to the comparison between logical(0) and NA, which rather than returning NA, returns logical(0).

This would seem appropriate, given that there is no actual comparison being made with NA, I think, since logical(0) is an 'empty' vector.

However, should all(logical(0)) return TRUE or logical(0)? For example:

> logical(0) == logical(0)
logical(0)

> all(logical(0) == logical(0))
[1] TRUE

If the initial comparison of logical(0) returns logical(0), which is not TRUE:

> logical(0) == TRUE
logical(0)

then why does all() return TRUE, if the individual comparison is not TRUE? By definition from ?all:

  Given a sequence of logical arguments, a logical value indicating whether or not all of the elements of x are TRUE.
  The value returned is TRUE if all of the values in x are TRUE, and FALSE if any of the values in x are FALSE.
  If na.rm = FALSE and x consists of a mix of TRUE and NA values, the value is NA.

Does this make any sense?

Thanks,

Marc Schwartz
Peter Dalgaard
2007-Jun-20 16:37 UTC
[Rd] Expected behavior from: all(c(NA, NA, NA) < NA, na.rm = TRUE)?
Marc Schwartz wrote:

> A simplified example is:
>
> > all(c(NA, NA, NA) > NA, na.rm = TRUE)
> [1] TRUE
>
> Is this expected by definition?
>
> [...]
>
> then why does all() return TRUE, if the individual comparison is not
> TRUE?
>
> [...]
>
> Does this make any sense?

I don't see the problem.
Isn't it just that all(logical(0))==TRUE by convention just like prod(numeric(0))==1 etc.?
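
For readers following along, here is a minimal sketch of the order in which R evaluates the original call. It assumes only base R; the names cmp and kept are illustrative and do not appear in the thread. The comparison is evaluated first, on the full vector, and na.rm = TRUE then drops the NAs from the comparison result, so all() ends up looking at an empty logical vector.

```r
# Step 1: the comparison is evaluated before all() sees anything.
cmp <- c(NA, NA, NA) > NA          # a length-3 logical vector of NA

# Step 2: na.rm = TRUE removes the NAs from that result
# (written out by hand here; 'kept' is an illustrative name).
kept <- cmp[!is.na(cmp)]           # nothing survives: logical(0)

# Step 3: all() of an empty logical vector is TRUE by convention,
# in the same way that prod() of an empty numeric vector is 1.
stopifnot(length(cmp) == 3L, all(is.na(cmp)))
stopifnot(length(kept) == 0L)
stopifnot(identical(all(kept), TRUE))
stopifnot(identical(all(cmp, na.rm = TRUE), TRUE))
stopifnot(identical(prod(numeric(0)), 1))
```

The result matches the TRUE reported in the original post. The only difference from the walk-through there is that the NAs are removed after the comparison rather than before it.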
Thomas Lumley
2007-Jun-20 16:48 UTC
[Rd] Expected behavior from: all(c(NA, NA, NA) < NA, na.rm = TRUE)?
On Wed, 20 Jun 2007, Marc Schwartz wrote:

> If my train of thought is correct, it seems to me that the behavior
> above distills down to the comparison between logical(0) and NA, which
> rather than returning NA, returns logical(0).
>
> This would seem appropriate, given that there is no actual comparison
> being made with NA, I think, since logical(0) is an 'empty' vector.
>
> However, should all(logical(0)) return TRUE or logical(0)? For example:
>
> > logical(0) == logical(0)
> logical(0)
>
> > all(logical(0) == logical(0))
> [1] TRUE

Yes.

> If the initial comparison of logical(0) returns logical(0), which is not
> TRUE:
>
> > logical(0) == TRUE
> logical(0)

Yes, they have different lengths, so they aren't equal.

> then why does all() return TRUE, if the individual comparison is not
> TRUE? By definition from ?all:
>
> Given a sequence of logical arguments, a logical value indicating
> whether or not all of the elements of x are TRUE.

This is the empty set question that should probably be a FAQ.

All elements of logical(0) are TRUE, in the vacuous sense that it has no elements.

The same sort of thing happens for any(logical(0)), which is FALSE; sum(numeric(0)), which is 0; prod(numeric(0)), which is 1; max(numeric(0)), which is -Inf; and min(numeric(0)), which is Inf.

This seems as though R is trying to be difficult, but there is a real benefit in terms of associativity:

  all(all(x), all(y)) is always the same as all(x, y) under this definition.
  prod(prod(x), prod(y)) is prod(x, y)
  min(min(x), min(y)) is min(x, y)

and so on.

The general principle is that a function made by 'reducing' a vector with an associative binary operator, when applied to an empty vector, gives the identity element for the operator. The identity element for AND is TRUE.

> Does this make any sense?

Yes, although it is initially surprising.

    -thomas

Thomas Lumley                     Assoc. Professor, Biostatistics
tlumley at u.washington.edu       University of Washington, Seattle
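
The associativity argument above can be checked directly. The following is a small sketch, assuming only base R; the helper all_by_reduce and the variables x, y, a and b are hypothetical names introduced for illustration, not anything from the thread or from R itself. It reduces a logical vector with the AND operator, supplying TRUE as the starting value, so that an empty input returns the identity element.

```r
# One non-empty and one empty vector for each case.
x <- c(TRUE, TRUE)
y <- logical(0)
a <- c(2, 3)
b <- numeric(0)

# Nesting the reduction gives the same answer as reducing everything at once,
# even though one of the pieces is empty.
stopifnot(identical(all(all(x), all(y)), all(x, y)))
stopifnot(identical(prod(prod(a), prod(b)), prod(a, b)))
stopifnot(identical(sum(sum(a), sum(b)), sum(a, b)))

# min() of an empty vector returns Inf but also issues a warning,
# so the warning is suppressed for the nested call.
nested_min <- suppressWarnings(min(min(a), min(b)))
stopifnot(identical(nested_min, min(a, b)))

# Hypothetical helper: all() rebuilt as a reduction with `&`,
# starting from the identity element TRUE.
all_by_reduce <- function(v) Reduce(`&`, v, TRUE)

stopifnot(identical(all_by_reduce(logical(0)), TRUE))
stopifnot(identical(all_by_reduce(c(TRUE, TRUE)), TRUE))
stopifnot(identical(all_by_reduce(c(TRUE, FALSE)), FALSE))
```

If all(logical(0)) returned anything other than TRUE, the first check would fail whenever one of the pieces happened to be empty, which is the practical benefit described above.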