Kelly Thompson
2022-Apr-06 20:13 UTC
[R] What is the intended behavior, when subsetting using brackets [ ], when the subset criterion has NA's?
I noticed that I get different results when subsetting using subset(), compared to subsetting using "brackets", when the subset criteria have NA's.

Here's an example:

#START OF EXAMPLE
my_data <- 1:5
my_data

my_subset_criteria <- c( F, F, T, NA, NA)
my_subset_criteria

#subsetting using brackets returns the data where my_subset_criteria equals TRUE, and also NA where my_subset_criteria is NA
my_data[my_subset_criteria == T]

#subsetting using subset returns the data where my_subset_criteria equals TRUE
subset(my_data, my_subset_criteria == T)
#END OF EXAMPLE

This behavior is also mentioned here:
https://statisticaloddsandends.wordpress.com/2018/10/07/subsetting-in-the-presence-of-nas/

Q. Is this the intended behavior when subsetting with brackets?

Thank you!
PIKAL Petr
2022-Apr-06 20:50 UTC
[R] What is the intended behavior, when subsetting using brackets [ ], when the subset criterion has NA's?
Hi,

A safer way to handle NA values is to use which():

my_data[which(my_subset_criteria == T)]
[1] 3

AFAIK it is intended.

Cheers
Petr

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
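The which() suggestion above can be sketched as follows. This is only an illustration using the data from the original post, with TRUE/FALSE spelled out instead of T/F; the names by_logical and by_which are made up for this example.

```r
my_data <- 1:5
my_subset_criteria <- c(FALSE, FALSE, TRUE, NA, NA)

# Logical indexing: every NA in the index produces an NA element in the result.
by_logical <- my_data[my_subset_criteria == TRUE]

# which() returns only the positions where the condition is TRUE,
# so both FALSE and NA entries are dropped before indexing.
by_which <- my_data[which(my_subset_criteria == TRUE)]

print(by_logical)
print(by_which)
```

One caveat with which(): negating its result, as in my_data[-which(cond)], selects nothing at all when no element matches, because -integer(0) is still an empty index.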
Ebert,Timothy Aaron
2022-Apr-06 20:52 UTC
[R] What is the intended behavior, when subsetting using brackets [ ], when the subset criterion has NA's?
I get an error with this:

my_subset_criteria <- c( F, F, T, NA, NA)
my_subset_criteria

Tim
Marc Schwartz
2022-Apr-06 21:11 UTC
[R] What is the intended behavior, when subsetting using brackets [ ], when the subset criterion has NA's?
Hi,

The behavior is as intended. Note that your subset criterion results in:

> my_subset_criteria == T
[1] FALSE FALSE  TRUE    NA    NA

If you review ?subset, you will see in the Details section:

"For ordinary vectors, the result is simply x[subset & !is.na(subset)]."

while reviewing ?"[", you will see in the section titled "NAs in indexing":

"When extracting, a numerical, logical or character NA index picks an unknown element and so returns NA in the corresponding element of a logical, integer, numeric, complex or character result, and NULL for a list. (It returns 00 for a raw result.)"

So, in the first case using subset(), NAs are explicitly excluded from the result, which is not the case by default with bracket-based subsetting.

In essence, to replicate the behavior of subset() using brackets:

> my_data[(my_subset_criteria == T) & !is.na(my_subset_criteria == T)]
[1] 3

Lastly, using 'T' as a single-character representation of the boolean TRUE is generally discouraged. While T and F are set to TRUE and FALSE at the start of a new R session, there is no guarantee that they will stay that way, as they can both be re-assigned:

> T <- "This is not TRUE"
> T
[1] "This is not TRUE"

whereas TRUE cannot be:

> TRUE <- "This is not TRUE"
Error in TRUE <- "This is not TRUE" :
  invalid (do_set) left-hand side to assignment

Regards,

Marc Schwartz
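As a minimal sketch of the two points above, the following compares subset() with the equivalent bracket expression and shows T being shadowed. The variable names keep, via_subset, via_brackets and t_still_true are invented for this illustration, and the re-assignment is wrapped in local() so that it does not alter T in the calling session.

```r
my_data <- 1:5
my_subset_criteria <- c(FALSE, FALSE, TRUE, NA, NA)

# Compute the condition once; it contains NA wherever the criteria are NA.
keep <- my_subset_criteria == TRUE

# subset() drops the NA cases; the bracket form needs !is.na() to do the same.
via_subset   <- subset(my_data, keep)
via_brackets <- my_data[keep & !is.na(keep)]

print(identical(via_subset, via_brackets))

# T is an ordinary variable and can be shadowed; TRUE is a reserved word.
t_still_true <- local({
  T <- "This is not TRUE"  # affects only this local environment
  identical(T, TRUE)
})
print(t_still_true)
```

Writing TRUE and FALSE in full avoids depending on whatever T and F happen to hold in the current environment.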
Fabio D'Agostino
2022-Apr-06 22:58 UTC
[R] What is the intended behavior, when subsetting using brackets [ ], when the subset criterion has NA's?
Hi Kelly,

I had a question very similar to yours months ago, and Jeff replied with this:
https://stat.ethz.ch/pipermail/r-help/2022-February/473861.html

I hope this helps.

Fabio