thr3ads.net - R help - [R] any and all [Apr 2024]


avi.e.gross at gmail.com

2024-Apr-12 19:52 UTC

[R] any and all

Base R has generic functions called any() and all() that I am having trouble
using.
 
It works fine when I play with it in a base R context as in:
> all(any(TRUE, TRUE), any(TRUE, FALSE))
[1] TRUE
> all(any(TRUE, TRUE), any(FALSE, FALSE))
[1] FALSE
 
But in a tidyverse/dplyr environment, it returns wrong answers.
 
Consider this example. I have data I have joined together with pairs of
columns representing a first generation and several other pairs representing
additional generations. I want to consider any pair where at least one of
the pair is not NA as a success. But in order to keep the entire row, I want
all three pairs to have some valid data. This seems like a fairly common and
reasonable requirement when evaluating data.
 
So to make it very general, I chose to do something a bit like this:
 
result <- filter(mydata,
                 all(
                   any(!is.na(first.a), !is.na(first.b)),
                   any(!is.na(second.a), !is.na(second.b)),
                   any(!is.na(third.a), !is.na(third.b))))
 
I apologize if the formatting is not seen properly. The above logically
should work. And it should be extendable to scenarios where you want at
least one of M columns to contain data as a group with N such groups of any
size.
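
As a possible illustration of why the call above does not filter row by row: any() and all() collapse everything they are given into a single logical value, whereas | and & work element by element. The sketch below uses base R only; the data frame and its values are hypothetical and are not the data from this post.

```r
# Hypothetical data: two pairs of columns with some missing values.
mydata <- data.frame(
  first.a  = c(1, NA, NA),
  first.b  = c(1, 2, NA),
  second.a = c(NA, 3, 4),
  second.b = c(5, NA, NA)
)

# any() and all() reduce ALL of their arguments to ONE logical value,
# so this is a single TRUE/FALSE for the whole data frame.
scalar_test <- with(mydata, all(
  any(!is.na(first.a), !is.na(first.b)),
  any(!is.na(second.a), !is.na(second.b))
))
length(scalar_test)  # one value, not one per row

# The vectorised operators | and & give one value per row instead.
row_test <- with(mydata,
  (!is.na(first.a) | !is.na(first.b)) &
    (!is.na(second.a) | !is.na(second.b))
)
length(row_test)     # one value per row

# Only the per-row version selects individual rows.
kept <- mydata[row_test, ]
```

A length-one condition passed to filter() is recycled to every row, so the all(any(...)) form keeps either every row or none.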
 
But since it did not work, I tried a plan that did work and feels silly. I
used mutate() to make new columns such as:
 
result <-
  mydata |>
  mutate(
    usable.1 = (!is.na(first.a) | !is.na(first.b)),
    usable.2 = (!is.na(second.a) | !is.na(second.b)),
    usable.3 = (!is.na(third.a) | !is.na(third.b)),
    usable = (usable.1 & usable.2 & usable.3)
  ) |>
  filter(usable == TRUE)
 
The above wastes time and effort creating new columns just so I can check the
calculations, then combines those columns into a Boolean that is used to
filter the result.
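
One way the intermediate columns might be avoided is to pass the vectorised tests to filter() directly, since filter() combines several conditions with a logical AND. This is only a sketch: the data frame below is a hypothetical stand-in for the joined data, and it assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical stand-in for the joined data described above.
mydata <- data.frame(
  first.a  = c(1, NA, NA, 4),
  first.b  = c(NA, 2, NA, 4),
  second.a = c(1, NA, 3, NA),
  second.b = c(NA, 2, 3, NA),
  third.a  = c(1, 2, NA, 4),
  third.b  = c(NA, NA, 3, 4)
)

# Each argument is a per-row test for one pair of columns;
# filter() keeps a row only if every argument is TRUE for it.
result <- mydata |>
  filter(
    !is.na(first.a)  | !is.na(first.b),
    !is.na(second.a) | !is.na(second.b),
    !is.na(third.a)  | !is.na(third.b)
  )

result
```

With this sample data, only the first two rows have at least one non-missing value in every pair.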
 
I know this is not the place to discuss dplyr. I want to check first if I am
doing anything wrong in how I use any/all. One guess is that the generic is
altered by dplyr or other packages I have attached with library().
 
And, of course, some aspects of delayed evaluation can interfere in subtle
ways.
 
I note I have had other problems with these base R functions before and
generally solved them by not using them, as shown above. I would much rather
use them, or something similar.
 
 
Avi
 
 


Duncan Murdoch

2024-Apr-12 21:59 UTC


On 12/04/2024 3:52 p.m., avi.e.gross at gmail.com wrote:
> [...]
> So to make it very general, I chose to do something a bit like this:

We can't really help you without a reproducible example.  It's not 
enough to show us something that doesn't run but is a bit like the real 
code.

Duncan Murdoch

Lennart Kasserra

2024-Apr-13 07:17 UTC


Hi Avi,


As Dénes Tóth has rightly diagnosed, you are building an "all or 
nothing" filter. However, you do not need to explicitly spell out all 
columns that you want to filter for; the "tidy" way would be to use a 
helper function like `if_all()` or `if_any()`. Consider this example (I 
hope I understand your intentions correctly):

```
library(dplyr)

data <- tribble(
  ~first.a, ~first.b, ~first.c,
  1L,       1L,       0L,
  NA,       1L,       0L,
  1L,       0L,       NA,
  NA,       NA,       1L
)
```

Let's say we only want to keep rows that have a non-missing value for 
either `first.a` or `first.b` (or hypothetical later generations like 
`second.a` and `second.b` etc.):

```
data |>
  filter(if_any(ends_with(c(".a", ".b")), \(x) !is.na(x)))
```

So: `filter()` (keep observations) `if_any` of the columns ending with 
.a or .b is not `NA` (we have to wrap `!is.na` into an anonymous 
function for it to be a valid argument type). This would yield

```
# A tibble: 3 × 3
  first.a first.b first.c
    <int>   <int>   <int>
1       1       1       0
2      NA       1       0
3       1       0      NA
```

Discarding only the row where both of them are missing. Another way of 
writing this would be

```
data |>
  filter(!if_all(ends_with(c(".a", ".b")), is.na))
```

i.e. don't keep rows where all columns ending in .a or .b are `NA`, 
which returns the same result. Hope this helps,

Lennart Kasserra
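
For the case in the original question, where every pair must have at least one non-missing value, one possible extension is to give filter() one `if_any()` call per group of columns. This is only a sketch, not something posted in the thread: the data frame is hypothetical, and it assumes each group shares a prefix such as `first.` or `second.`.

```
library(dplyr)

# Hypothetical data with two generations, each stored as a pair of columns.
mydata <- data.frame(
  first.a  = c(1L, NA, 5L, NA),
  first.b  = c(NA, NA, 6L, 7L),
  second.a = c(2L, 3L, NA, NA),
  second.b = c(NA, 4L, NA, 8L)
)

# One if_any() per group: a row is kept only when every group
# has at least one non-missing value.
result <- mydata |>
  filter(
    if_any(starts_with("first."),  \(x) !is.na(x)),
    if_any(starts_with("second."), \(x) !is.na(x))
  )

result
```

With this sample data, the second row is dropped because both `first.` columns are missing, and the third because both `second.` columns are missing.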

