Martin Maechler
2015-Jan-26 12:45 UTC
[R] Sum function and missing values --- need to mimic SAS sum function
>>>>> Jim Lemon <drjimlemon at gmail.com>
>>>>>     on Mon, 26 Jan 2015 11:21:03 +1100 writes:

> Hi Allen, How about this:

> sum_w_NA<-function(x) ifelse(all(is.na(x)),NA,sum(x,na.rm=TRUE))

Excuse, Jim, but that's yet another "horrible misuse of ifelse()"

John Fox's reply *did* contain the "proper" solution

    if (all(is.na(x))) NA else sum(x, na.rm=TRUE)

The ifelse() function should never be used in such cases.
Read more after googling

    "Do NOT use ifelse()"

-- include the quotes in your search --

or directly at
    http://stat.ethz.ch/pipermail/r-help/2014-December/424367.html

Yes, this has been on R-help a month ago..
Martin

> On Mon, Jan 26, 2015 at 10:21 AM, Allen Bingham
> <aebingham2 at gmail.com> wrote:
>> I understand that in order to get the sum function to
>> ignore missing values I need to supply the argument
>> na.rm=TRUE. However, when summing numeric values in which
>> ALL components are "NA" ... the result is 0.0 ... instead
>> of (what I would get from SAS) of NA (or in the case of
>> SAS ".").
>>
>> Accordingly, I've had to go to 'extreme' measures to get
>> the sum function to result in NA if all arguments are
>> missing (otherwise give me a sum of all non-NA elements).
>>
>> So for example here's a snippet of code that ALMOST does
>> what I want:
>>
>> SumValue<-apply(subset(InputDataFrame,!is.na(Variable.1)|!is.na(Variable.2),
>> select=c(Variable.1,Variable.2)),1,sum,na.rm=TRUE)
>>
>> In reality this does NOT give me records with NA for
>> SumValue ... but it doesn't give me values for any
>> records in which both Variable.1 and Variable.2 are NA
>> --- which is "good enough" for my purposes.
>>
>> I'm guessing with a little more work I could come up with
>> a way to adapt the code above so that I could get it to
>> work like SAS's sum function ...
>>
>> ... but before I go that extra mile I thought I'd ask
>> others if they know of functions in either base R ... or
>> in a package that will better mimic the SAS sum function.
>>
>> Any suggestions?
>>
>> Thanks.
>> ______________________________________
>> Allen Bingham
>> aebingham2 at gmail.com
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and
>> more, see https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html and provide
>> commented, minimal, self-contained, reproducible code.
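For readers who want to try the difference, here is a minimal sketch. The function name sas_sum and the sample vectors are made up for illustration; the function body is the if/else expression quoted above. The last two lines show one reason ifelse() is a poor fit for a scalar condition: its result takes the shape of the test, not of the chosen value.

```r
# sas_sum is a hypothetical name; the body is the if/else form quoted above.
sas_sum <- function(x) {
  # all(is.na(x)) is a single TRUE/FALSE, so a plain if/else is the natural fit
  if (all(is.na(x))) NA else sum(x, na.rm = TRUE)
}

all_missing  <- c(NA_real_, NA_real_)   # illustrative data
some_missing <- c(1.5, NA, 2.5)         # illustrative data

sum(all_missing, na.rm = TRUE)   # 0: the behaviour the original poster objected to
sas_sum(all_missing)             # NA
sas_sum(some_missing)            # 4

# ifelse() shapes its result like the test, so a length-1 test keeps one element
ifelse(TRUE, c(10, 20, 30), 0)   # 10
if (TRUE) c(10, 20, 30) else 0   # 10 20 30
```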
Sven E. Templer
2015-Jan-26 14:56 UTC
[R] Sum function and missing values --- need to mimic SAS sum function
you can also define 'na.rm' in sum() by 'NA state' of x (where x is
your vector holding the data):

    sum(x, na.rm=!all(is.na(x)))
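A small sketch of how this one-liner behaves, with the wrapper name sum_na_rm_trick invented here for illustration. It also shows one edge case where it differs from the if/else form: a zero-length vector.

```r
# The one-liner wrapped in a function; the name sum_na_rm_trick is hypothetical.
sum_na_rm_trick <- function(x) sum(x, na.rm = !all(is.na(x)))

sum_na_rm_trick(c(NA, NA))     # NA: na.rm is FALSE, so the NAs propagate
sum_na_rm_trick(c(1, NA, 2))   # 3:  na.rm is TRUE, so the NA is dropped

# Edge case: for a zero-length vector all(is.na(x)) is TRUE,
# but sum() of nothing is 0, so the two forms disagree.
x0 <- numeric(0)
sum_na_rm_trick(x0)                                 # 0
if (all(is.na(x0))) NA else sum(x0, na.rm = TRUE)   # NA
```

Which of the two results is preferable for empty input depends on the data at hand.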
Allen Bingham
2015-Jan-26 21:56 UTC
[R] Sum function and missing values --- need to mimic SAS sum function
Sven and John,

Thanks for your suggested code ... hits the mark! The code by John is
what I need to be able to use in an apply function, but I really like
the simplicity of Sven's suggestion.

Also thanks to all who replied --- really helped broaden my knowledge of R.

Allen
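As a sketch of the apply() use mentioned here: the data values below are made up, only the column names come from the original question, and row_sum_or_na is a hypothetical helper name wrapping the if/else form.

```r
# Made-up data; only the column names come from the original question.
InputDataFrame <- data.frame(
  Variable.1 = c(1, NA, 3, NA),
  Variable.2 = c(10, 20, NA, NA)
)

# Hypothetical helper: NA when every value is missing, otherwise the sum
# of the non-missing values.
row_sum_or_na <- function(x) if (all(is.na(x))) NA else sum(x, na.rm = TRUE)

# apply() over rows (MARGIN = 1) keeps every record, including the all-NA one.
InputDataFrame$SumValue <- apply(
  InputDataFrame[, c("Variable.1", "Variable.2")], 1, row_sum_or_na
)
InputDataFrame$SumValue   # 11 20 3 NA
```

Unlike the subset() approach in the original question, no rows are dropped; the record with both values missing simply gets NA.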
Hervé Pagès
2015-Jan-26 22:22 UTC
[R] Sum function and missing values --- need to mimic SAS sum function
Hi Martin,

On 01/26/2015 04:45 AM, Martin Maechler wrote:
> The ifelse() function should never be used in such cases.
> Read more after googling
>
>     "Do NOT use ifelse()"
>
> -- include the quotes in your search --
>
> or directly at
> http://stat.ethz.ch/pipermail/r-help/2014-December/424367.html

Interesting. You could have added the following item to your list:

  4. less likely to play strange tricks on you:

       > ifelse(TRUE, a <- 2L, a <- 3L)
       [1] 2
       > a
       [1] 3

Yeah I've seen people using ifelse() that way and being totally
confused...

Cheers,
H.

--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fredhutch.org
Phone: (206) 667-5791
Fax: (206) 667-1319
Boris Steipe
2015-Jan-27 00:42 UTC
[R] Sum function and missing values --- need to mimic SAS sum function
> sum(x, na.rm=!all(is.na(x)))

That's the kind of idiom that brings the poor chap who has to maintain it to tears. ;-)
Bert Gunter
2015-Jan-27 10:54 UTC
[R] Sum function and missing values --- need to mimic SAS sum function
Huh??

> ifelse(TRUE, a <- 2L, a <- 3L)
[1] 2
> a
[1] 2

Please clarify.

-- Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge is
certainly not wisdom."
Clifford Stoll
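The two transcripts in this thread disagree about the final value of a. The sketch below is one way to check in your own session. It calls base::ifelse explicitly, which is an assumption about which ifelse() is of interest, and it does not try to explain the differing result reported earlier. The variable names b and d are added here for illustration.

```r
# base::ifelse is named explicitly so that any other ifelse() on the search
# path is bypassed (an assumption about which function is of interest).
a <- 0L
base::ifelse(TRUE, a <- 2L, a <- 3L)
a   # 2: with an all-TRUE test the 'no' argument is never evaluated

# With a mixed test both arguments are evaluated, 'yes' first and then 'no',
# so the second assignment wins.
b <- 0L
base::ifelse(c(TRUE, FALSE), b <- 2L, b <- 3L)
b   # 3

# if/else evaluates exactly one branch.
d <- 0L
if (TRUE) d <- 2L else d <- 3L
d   # 2
```

Either way, assignments inside ifelse() arguments have side effects that depend on the contents of the test, which is a good reason to avoid them.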