Paul Miller
2012-Mar-17 13:24 UTC
[R] Coalesce function in BBmisc, emoa, and microbenchmark packages
Hello All,

Need to coalesce some columns using R. Looked online to see how this is done. One approach appears to be to use ifelse. Also uncovered a coalesce function in the BBmisc, emoa, and microbenchmark packages.

Trouble is I can't seem to get it to work in any of these packages. Or perhaps I misunderstand what it's intended to do. The documentation is generally pretty scant.

Working with two columns: Date of Death (DOD) and Last Known Date Alive (LKDA). One or the other column is populated for each of the patients in my dataframe and the other column is blank.

When I run code like "with(Demographics, coalesce(DOD, LKDA))", the function generates a value whenever DOD is not missing and generates NA otherwise (even though the value for LKDA is not missing). So, for example, I get an NA for the 8th element below, even though I have a value of "2008-03-25" for LKDA.

> with(Demographics, coalesce(DOD, LKDA))
[1] "2006-07-23" "2008-07-09" "2007-12-16" "2008-01-19" "2009-05-05" "2006-04-29" "2006-06-18" NA

At least that's what happens in the BBmisc and emoa packages. The microbenchmark package appears not to have a coalesce function though the documentation says it does. I think I've seen instances where a function gets removed from a package. So maybe that's what happened here.

Thought maybe there is a difference between blank and NA as far as R or the function is concerned. The is.na() function seems to indicate that a blank is an NA. I also tried making the blanks into NA but that didn't help.

Does anyone have experience with the coalesce function in any of the three packages? If so, can they help me understand what I might be doing wrong?

Thanks,

Paul
R. Michael Weylandt
2012-Mar-17 13:55 UTC
[R] Coalesce function in BBmisc, emoa, and microbenchmark packages
I don't think any of these are doing what you want: they just scan a list of arguments for the first non-null argument (with some small differences in implementation). They're more programming utilities than data analysis tools.

It sounds like it's going to be easier to whip something up with ifelse().

microbenchmark has coalesce, but it's not (readily) available for end-user use, specifically for that reason.

Hope this helps,

Michael
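A minimal sketch of the two behaviours discussed in this reply, assuming both columns are of class Date as in the original post. The helper `first_non_null` is hypothetical: it only imitates the "first non-null argument" idea and is not the source of the BBmisc, emoa, or microbenchmark functions. The two-element vectors are made-up stand-ins for the real columns.

```r
# Hypothetical helper: return the first argument that is not NULL.
# It looks at each argument as a whole, never at individual elements.
first_non_null <- function(...) {
  args <- list(...)
  for (i in seq_along(args)) {
    if (!is.null(args[[i]])) return(args[[i]])
  }
  NULL
}

# Made-up stand-ins for the DOD and LKDA columns.
DOD  <- as.Date(c("2006-07-23", NA))
LKDA <- as.Date(c(NA, "2008-03-25"))

# DOD as a whole is not NULL, so it comes back unchanged, NA included.
scalar_result <- first_non_null(DOD, LKDA)

# Element-wise choice with ifelse(). ifelse() drops the Date class,
# so the numeric result is converted back explicitly.
picked <- ifelse(is.na(DOD), LKDA, DOD)
elementwise_result <- as.Date(picked, origin = "1970-01-01")
```

Under these assumptions, `scalar_result` still carries the NA in its second position, while `elementwise_result` takes the LKDA value there.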
Brian Diggs
2012-Mar-19 21:24 UTC
[R] Coalesce function in BBmisc, emoa, and microbenchmark packages
I didn't know about these other coalesce functions, but I had written my own. Looking at them, they don't seem to be vectorized; mine is. That's not to say that there may not be other problems with it.

##' Return first non-NA, vectorized
##'
##'
##' @param ... Vectors, all of the same length.
##' @return Vector of the same length as the input vectors, each
##' element of which is the first corresponding non-NA element in the
##' given vectors in the order they are specified
##' @author Brian Diggs
coalesce <- function(...) {
  dots <- list(...)
  ret <- Reduce(function (x,y) ifelse(!is.na(x),x,y), dots)
  class(ret) <- class(dots[[1]])
  ret
}

And using your example data:

Demographics <- data.frame(DOD = as.Date(c("2006-07-23", "2008-07-09",
  "2007-12-16", "2008-01-19", "2009-05-05", "2006-04-29", "2006-06-18", NA)),
  LKDA = as.Date(c(NA, NA, NA, NA, NA, NA, NA, "2008-03-25")))

> with(Demographics, coalesce(DOD, LKDA))
[1] "2006-07-23" "2008-07-09" "2007-12-16" "2008-01-19" "2009-05-05"
[6] "2006-04-29" "2006-06-18" "2008-03-25"

--
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University
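As an alternative sketch (not the function posted above), the same element-wise idea can be written with index assignment, which keeps the class of the first vector without resetting it afterwards. The name `coalesce_fill` and the three example vectors are hypothetical, and the sketch assumes all inputs have the same length and the same type.

```r
# Hypothetical variant: fill the NA gaps in the first vector from later ones.
# Index assignment preserves the class of the first vector (Date here).
coalesce_fill <- function(...) {
  dots <- list(...)
  ret <- dots[[1]]
  for (i in seq_along(dots)[-1]) {
    gaps <- is.na(ret)               # positions still missing
    ret[gaps] <- dots[[i]][gaps]     # take them from the next vector
  }
  ret
}

# Made-up example with three Date vectors.
a <- as.Date(c("2006-07-23", NA, NA))
b <- as.Date(c(NA, "2008-03-25", NA))
d <- as.Date(c("2001-01-01", "2002-02-02", "2010-10-10"))

filled <- coalesce_fill(a, b, d)
```

With these inputs, each position of `filled` holds the first non-NA date found when reading across `a`, `b`, and `d` in that order.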