Ian Strang
2011-Nov-20 20:38 UTC
[R] Adding two or more columns of a data frame for each row when NAs are present.
I am fairly new to R and would like help with the problem below. I am trying to sum and count several columns, row by row, in the data frame yy below. All works well in Example 1. When I try to add the columns with an NA in Q21 (Example 2), I get NA as mySum. I would like NA to be treated as 0, or ignored.

I wrote a function to try to count an NA element as 0 (Example 3). It works with a few warnings (Example 4), but still gives NA instead of the sum when there is an NA in an element (Example 5).

In Examples 6 and 7, I tried using sum(), but it just sums the whole data frame, I think.

How do I add together several columns, giving the result for each row in mySum? NA should be treated as 0. Please note, I do not want to sum all the columns, as I think rowSums would do, just the selected ones.

Thanks for your help.
Ian

> yy <- read.table( header = T, sep=",", text =     ## to create a data frame
+ "Q20, Q21, Q22, Q23, Q24
+  0,1, 2,3,4
+  1,NA,2,3,4
+  2,1, 2,3,4")
+ yy
  Q20 Q21 Q22 Q23 Q24
1   0   1   2   3   4
2   1  NA   2   3   4
3   2   1   2   3   4

> x <- transform( yy,     ############## Example 1
+   mySum = as.numeric(Q20) + as.numeric(Q22) + as.numeric(Q24),
+   myCount = as.numeric(!is.na(Q20))+as.numeric(!is.na(Q21))+as.numeric(!is.na(Q24))
+ )
+ x
  Q20 Q21 Q22 Q23 Q24 mySum myCount
1   0   1   2   3   4     6       3
2   1  NA   2   3   4     7       2
3   2   1   2   3   4     8       3

> x <- transform( yy,     ################ Example 2
+   mySum = as.numeric(Q20) + as.numeric(Q21) + as.numeric(Q24),
+   myCount = as.numeric(!is.na(Q20))+as.numeric(!is.na(Q21))+as.numeric(!is.na(Q24))
+ )
+ x
  Q20 Q21 Q22 Q23 Q24 mySum myCount
1   0   1   2   3   4     5       3
2   1  NA   2   3   4    NA       2
3   2   1   2   3   4     7       3

> NifAvail <- function(x) { if (is.na(x)) x<-0 else x <- x   ############### Example 3
+   return(as.numeric(x))
+ } #end function
+ NifAvail(5)
[1] 5
+ NifAvail(NA)
[1] 0

> x <- transform( yy,
+   mySum = NifAvail(Q20) + NifAvail(Q22) + NifAvail(Q24),     ############### Example 4
+   myCount = as.numeric(!is.na(Q20))+as.numeric(!is.na(Q21))+as.numeric(!is.na(Q24))
+ )
Warning messages:
1: In if (is.na(x)) x <- 0 else x <- x :
  the condition has length > 1 and only the first element will be used
2: In if (is.na(x)) x <- 0 else x <- x :
  the condition has length > 1 and only the first element will be used
3: In if (is.na(x)) x <- 0 else x <- x :
  the condition has length > 1 and only the first element will be used
> x
  Q20 Q21 Q22 Q23 Q24 mySum myCount
1   0   1   2   3   4     6       3
2   1  NA   2   3   4     7       2
3   2   1   2   3   4     8       3

> x <- transform( yy,
+   mySum = NifAvail(Q20) + NifAvail(Q21) + NifAvail(Q24),     ################ Example 5
+   myCount = as.numeric(!is.na(Q20))+as.numeric(!is.na(Q21))+as.numeric(!is.na(Q24))
+ )
Warning messages:
1: In if (is.na(x)) x <- 0 else x <- x :
  the condition has length > 1 and only the first element will be used
2: In if (is.na(x)) x <- 0 else x <- x :
  the condition has length > 1 and only the first element will be used
3: In if (is.na(x)) x <- 0 else x <- x :
  the condition has length > 1 and only the first element will be used
> x
  Q20 Q21 Q22 Q23 Q24 mySum myCount
1   0   1   2   3   4     5       3
2   1  NA   2   3   4    NA       2
3   2   1   2   3   4     7       3

> x <- transform( yy,     ############ Example 6
+   mySum = sum(as.numeric(Q20), as.numeric(Q21), as.numeric(Q23), na.rm=T),
+   myCount = as.numeric(!is.na(Q20))+as.numeric(!is.na(Q21))+as.numeric(!is.na(Q24))
+ )
+ x
  Q20 Q21 Q22 Q23 Q24 mySum myCount
1   0   1   2   3   4    14       3
2   1  NA   2   3   4    14       2
3   2   1   2   3   4    14       3

> x <- transform( yy,     ############# Example 7
+   mySum = sum(as.numeric(Q20), as.numeric(Q22), as.numeric(Q23), na.rm=T),
+   myCount = as.numeric(!is.na(Q20))+as.numeric(!is.na(Q21))+as.numeric(!is.na(Q24))
+ )
+ x
  Q20 Q21 Q22 Q23 Q24 mySum myCount
1   0   1   2   3   4    18       3
2   1  NA   2   3   4    18       2
3   2   1   2   3   4    18       3
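The warnings in Examples 4 and 5 arise because `if` tests a single condition, while `transform()` hands the function a whole column; only the first element (not NA here) decides the branch, so the NA in Q21 passes through untouched. In R 4.2.0 and later the same call is an error rather than a warning. A minimal sketch of a vectorized replacement follows. The helper name `na_to_zero` is hypothetical and not from the original post, and the data frame is rebuilt with `data.frame()` for brevity.

```r
# Same values as the data frame yy in the post
yy <- data.frame(
  Q20 = c(0, 1, 2),
  Q21 = c(1, NA, 1),
  Q22 = c(2, 2, 2),
  Q23 = c(3, 3, 3),
  Q24 = c(4, 4, 4)
)

# Hypothetical helper: replaces every NA in a vector with 0.
# Logical indexing works element by element, unlike if ().
na_to_zero <- function(x) {
  x <- as.numeric(x)
  x[is.na(x)] <- 0
  x
}

x <- transform(yy,
  mySum = na_to_zero(Q20) + na_to_zero(Q21) + na_to_zero(Q24)
)
x$mySum
# [1] 5 5 7
```

This keeps the column-by-column style of the original examples. Whether replacing NA with 0 is appropriate depends on what a missing answer means in the data.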
Dennis Murphy
2011-Nov-21 04:30 UTC
[R] Adding two or more columns of a data frame for each row when NAs are present.
Hi:

Does this work for you?

> yy
  Q20 Q21 Q22 Q23 Q24
1   0   1   2   3   4
2   1  NA   2   3   4
3   2   1   2   3   4

rowSums(yy, na.rm = TRUE)
 1  2  3
10 10 12

# Use a subset of the variables in yy:
selectVars <- paste('Q', c(20, 21, 24), sep = '')
rowSums(yy[, selectVars], na.rm = TRUE)
1 2 3
5 5 7

HTH,
Dennis
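The same subsetting idea can produce both columns the original post asked for. The sketch below assumes the data frame from the post, rebuilt with `data.frame()`, and reuses the name `selectVars` from the reply; counting with `rowSums(!is.na(...))` is an addition here, not something the reply showed.

```r
# Same values as the data frame yy in the post
yy <- data.frame(
  Q20 = c(0, 1, 2),
  Q21 = c(1, NA, 1),
  Q22 = c(2, 2, 2),
  Q23 = c(3, 3, 3),
  Q24 = c(4, 4, 4)
)

selectVars <- c("Q20", "Q21", "Q24")

# Row-wise sum over the selected columns, skipping NAs
yy$mySum <- rowSums(yy[, selectVars], na.rm = TRUE)

# Row-wise count of non-missing values: is.na() on a data frame
# returns a logical matrix, and TRUE counts as 1 in rowSums()
yy$myCount <- rowSums(!is.na(yy[, selectVars]))

yy
#   Q20 Q21 Q22 Q23 Q24 mySum myCount
# 1   0   1   2   3   4     5       3
# 2   1  NA   2   3   4     5       2
# 3   2   1   2   3   4     7       3
```

One caveat: with `na.rm = TRUE`, a row whose selected values are all NA gets a sum of 0, so keeping `myCount` alongside `mySum` helps distinguish a true zero from a row with no data.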
David Winsemius
2011-Nov-21 05:03 UTC
[R] Adding two or more columns of a data frame for each row when NAs are present.
On Nov 20, 2011, at 3:38 PM, Ian Strang wrote:

> I am fairly new to R and would like help with the problem below. I
> am trying to sum and count several columns in the data frame yy below.
> All works well as in example 1. When I try to add the columns, with
> an NA in Q21, I get NA as mySum. I would like NA to be treated as
> 0, or ignored.

"Ignored" is by far the better option, and that is easily accomplished by reading the help page for 'sum' and using the obvious parameter settings.

?sum

> I wrote a function to try to count an NA element as 0, Example 3
> function. It works with a few warnings, Example 4, but still gives
> NA instead of the addition when there is an NA in an element.
>
> In Example 6 & 7, I tried using sum() but it just sums the whole
> data frame, I think,

It sums whatever you give it.

> How do I add together several columns giving the result for each row
> in mySum?

?rowSums   # which also has the same parameter setting for dealing with NAs.

> NA should be treated as a 0.

Nooo, noooo, nooooooooo. If it's missing it's not 0.

> Please, note, I do not want to sum all the columns, as I think
> rowSums would do, just the selected ones.

Fine. Then select them:

?"["

--
David.

David Winsemius, MD
West Hartford, CT
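Two of the points above can be checked directly: `sum()` collapses everything it is given into one number, which `transform()` then recycles down the column (the 14 in Example 6), and ignoring an NA is not the same as counting it as 0 once anything other than a plain sum is computed. The sketch below assumes the data frame from the original post, rebuilt with `data.frame()`; the variable names `cols`, `total`, `per_row`, `mean_ignoring_na` and `mean_na_as_zero` are illustrative only.

```r
# Same values as the data frame yy in the post
yy <- data.frame(
  Q20 = c(0, 1, 2),
  Q21 = c(1, NA, 1),
  Q22 = c(2, 2, 2),
  Q23 = c(3, 3, 3),
  Q24 = c(4, 4, 4)
)

cols <- c("Q20", "Q21", "Q23")  # the columns used in Example 6

# sum() returns a single grand total over everything passed to it
total <- sum(yy[, cols], na.rm = TRUE)
total
# [1] 14

# rowSums() returns one value per row
per_row <- rowSums(yy[, cols], na.rm = TRUE)
unname(per_row)
# [1] 4 4 6

# For sums the two treatments of NA agree, but for means they differ:
mean_ignoring_na <- rowMeans(yy[, cols], na.rm = TRUE)  # row 2: (1 + 3) / 2
mean_na_as_zero  <- per_row / length(cols)              # row 2: (1 + 0 + 3) / 3
unname(mean_ignoring_na[2])
# [1] 2
unname(mean_na_as_zero[2])
# [1] 1.333333
```

Row 2 shows the difference: dropping the missing answer gives a mean of 2, while treating it as a score of 0 pulls the mean down to about 1.33.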