Anthony Damico
2012-Jun-20 20:44 UTC
[R] binary operators that never return missing values
Hi, I work with data sets with lots of missing values. We often need to conduct logical tests on numeric vectors containing missing values. I've searched around for material and conversations on this topic, but I'm having a hard time finding anything. Has anyone written a package that deals with this sort of thing? All I want are a group of functions like the ones I've posted below, but I'm worried I'm re-inventing the wheel.. If they're not already on CRAN, I feel like I should add them. Any pointers to work already completed on this subject would be appreciated. Thanks!

Anthony Damico
Kaiser Family Foundation

Here's a simple example of what I need done on a regular basis:

#two numeric vectors
a <- c( 1 , NA , 7 , 2 , NA )
b <- c( NA , NA , 9 , 1 , 6 )

#this has lots of NAs
a > b

#save this result in x
x <- (a > b)

#overwrite NAs in x with falses (which we do a lot)
x <- ifelse( is.na( x ) , F , x )

#now x has only trues and falses
x

################
Here's an example function that solves the problem for "greater than"
################

#construct a function that performs the same steps:
"%>F%" <-
    function( a , b ){
        x <- (a > b)
        x.false <- ifelse( is.na( x ) , F , x )
        x.false
    }

#run the function
a %>F% b
I'm not sure I got the question but is this something like what you want?

x[is.na(x)] <- FALSE
x

John Kane
Kingston ON Canada
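A minimal sketch of the in-place replacement suggested in this reply, reusing `a` and `b` from the original post. The variable `y` is introduced only for this illustration. One detail worth checking when using this approach: assigning the logical constant FALSE keeps the vector logical, whereas assigning the quoted string "FALSE" coerces the whole vector to character.

```r
a <- c(1, NA, 7, 2, NA)
b <- c(NA, NA, 9, 1, 6)

# Logical constant: x stays a logical vector
x <- a > b
x[is.na(x)] <- FALSE
x

# Quoted string: y is silently coerced to a character vector
y <- a > b
y[is.na(y)] <- "FALSE"
class(y)
```

The character version can no longer be used directly for logical indexing or in `if` conditions, so the unquoted constant is usually what is wanted here.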
R. Michael Weylandt
2012-Jun-20 20:54 UTC
[R] binary operators that never return missing values
Hi Anthony,

No, I don't believe this exists on CRAN already (happy to be proven wrong though) but I might suggest you approach things a different way: instead of defining this operator by operator with infix notation, why not go after `+`, `>` directly? If you put a class on your vectors, you can define Ops.class which will change the behavior of all those sorts of things.

Simple example (probably not complete nor necessarily advisable)

a <- c( 1 , NA , 7 , 2 , NA )
b <- c( NA , NA , 9 , 1 , 6 )

class(a) <- class(b) <- "damico"

Ops.damico <- function(e1, e2 = NULL){
    e1[is.na(e1)] <- 0
    e2[is.na(e2)] <- 0
    NextMethod()
}

a < b

More nuance is available, but this hopefully gives you a start. You might, e.g., think about setting this as something more like:

Ops.damico <- function(e1, e2 = NULL){
    if(.Generic %in% c("==","!=","<","<=",">=",">")){
        e1[is.na(e1)] <- 0
        e2[is.na(e2)] <- 0
    }
    NextMethod()
}

so you don't mess up arithmetic but only the boolean comparisons.

Best,
Michael
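A hedged variation on the group-generic idea in this reply, offered as a sketch rather than as the approach posted above. Instead of replacing NA inputs with 0 before the comparison, it runs the built-in operator on the unclassed values and sets NA results to FALSE afterwards, which matches the behaviour of the `%>F%` function in the original post. The class name "nafalse" and the helper `as_nafalse()` are hypothetical names chosen for this illustration.

```r
# Hypothetical constructor: tag a vector with the illustrative class "nafalse"
as_nafalse <- function(x) structure(x, class = "nafalse")

Ops.nafalse <- function(e1, e2) {
  op <- get(.Generic, mode = "function")
  # Unary operators such as -x or !x arrive with e2 missing
  if (missing(e2)) return(op(unclass(e1)))
  # Apply the built-in operator to the plain underlying values
  res <- op(unclass(e1), unclass(e2))
  # Only comparison results have their NAs replaced; arithmetic keeps NA
  if (.Generic %in% c("==", "!=", "<", "<=", ">=", ">")) {
    res[is.na(res)] <- FALSE
  }
  res
}

a <- as_nafalse(c(1, NA, 7, 2, NA))
b <- as_nafalse(c(NA, NA, 9, 1, 6))

a > b   # comparisons never return NA
a < b
a + b   # arithmetic still propagates NA
```

Note that in this sketch the results are plain vectors: the class is dropped after each operation, so chained operations fall back to ordinary NA semantics.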
Hello, again.

I have two apologies, to the list for having forgotten to cc my previous reply to this thread, and to you for not having understood that you wanted it solved in one step. My solution would need two steps. Now revised.

no.na <- function(x, value=FALSE){x[is.na(x)] <- value; x}

no.na(a > b)

x <- (a > b)
x <- ifelse( is.na( x ) , F , x )
identical( x, no.na(a > b) )
[1] TRUE

Hope this helps,

Rui Barradas
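Building on the `no.na()` helper in this reply, a family of infix operators like the one in the original post could be generated from a single wrapper. This is only a sketch: `na_false()` and the operator names other than `%>F%` are hypothetical and do not come from the thread.

```r
# Helper from the reply above: replace NA with a chosen value
no.na <- function(x, value = FALSE) { x[is.na(x)] <- value; x }

# Hypothetical wrapper: turn any binary comparison into an NA-free version
na_false <- function(op) function(a, b) no.na(op(a, b))

`%>F%`  <- na_false(`>`)
`%<F%`  <- na_false(`<`)
`%>=F%` <- na_false(`>=`)
`%<=F%` <- na_false(`<=`)
`%==F%` <- na_false(`==`)

a <- c(1, NA, 7, 2, NA)
b <- c(NA, NA, 9, 1, 6)

a %>F% b
a %<F% b
a %==F% b
```

Generating the operators from one wrapper keeps the NA rule in a single place, so changing the replacement value only requires touching `no.na()`.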
Duncan Murdoch
2012-Jun-20 23:04 UTC
[R] binary operators that never return missing values
On 12-06-20 4:44 PM, Anthony Damico wrote:
> #overwrite NAs in x with falses (which we do a lot)
> x <- ifelse( is.na( x ) , F , x )
>
> #now x has only trues and falses
> x

Not necessarily. F is a variable; if it happens to hold the value TRUE or 17, then x will get that.

For your question: I think what you're doing is a bad idea. There are certain relations that hold for ">" that just don't hold for your function, e.g.

(a > b) is the same as !(a <= b)
(a > b) is the same as ( !(a < b) & (a != b) )
if !(a < b) and !(b < c), then !(a < c)

etc. I think you'll find it very difficult to define the other comparison operators in a way that doesn't lead to strange behaviour when it violates these relations. Even if you never use any other comparisons, your reasoning about results will end up incorrect, because these relations are so ingrained into our psyches.

It would probably be easier to get consistency if you treated NA as -Inf or +Inf, or just avoided the suggestive name: define foo(a,b) to return TRUE or FALSE according to your desired rules, and don't pretend it's an order relation.

Duncan Murdoch
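Both points in this reply can be checked with a short sketch. The function names `gt_f()` and `le_f()` are hypothetical stand-ins for NA-to-FALSE versions of `>` and `<=`; they are not from the thread.

```r
# 1. F is an ordinary variable and can be reassigned; FALSE is a reserved word
F <- TRUE
flags <- c(NA, TRUE, FALSE)
bad  <- ifelse(is.na(flags), F, flags)      # NA is replaced by TRUE here
good <- ifelse(is.na(flags), FALSE, flags)  # NA is replaced by FALSE
rm(F)                                       # remove the masking variable again

# 2. With NA mapped to FALSE, (a > b) no longer equals !(a <= b)
gt_f <- function(a, b) { r <- a > b;  r[is.na(r)] <- FALSE; r }
le_f <- function(a, b) { r <- a <= b; r[is.na(r)] <- FALSE; r }

a <- c(1, NA, 7, 2, NA)
b <- c(NA, NA, 9, 1, 6)

lhs <- gt_f(a, b)
rhs <- !le_f(a, b)

# The two sides agree only where neither input is NA
lhs == rhs
```

Wherever either input is missing, both `gt_f` and `le_f` return FALSE, so negating one of them cannot reproduce the other. That is the inconsistency the reply warns about.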