Michael.Laviolette at dhhs.state.nh.us
2015-Jun-22 13:46 UTC
[R] Omitting NA's using dcast (reshape2 package)
I'm using the "dcast" function from Hadley's "reshape2" package to do some tabulations. I can't get it to exclude NA's in the variables being tabulated. Here's a simple example.

v1 <- c(rep("A", 5), rep("B", 5), NA)
v2 <- c("X", "Y", "Y", "Z", "Z", "X", "Y", "Y", "Z", NA, "Z")
v3 <- c(rep("a", 4), "c", "a", "b", NA, "c", "b", "c")
df <- data.frame(v1, v2, v3)
rm(v1, v2, v3)

library(reshape2)
dcast(df, v1 ~ v2, length, margins = TRUE)
#      v1 X Y Z NA (all)
# 1     A 1 2 2  0     5
# 2     B 1 2 1  1     5
# 3  <NA> 0 0 1  0     1
# 4 (all) 2 4 4  1    11

# "drop" argument has no effect
# na.omit will skip all records with any missing value

What I want is this:

#      v1 X Y Z (all)
# 1     A 1 2 2     5
# 2     B 1 2 1     4
# 3 (all) 2 4 3     9

Does anyone have any ideas?

Thanks,
Mike L.
You could apply na.omit() to just the columns you are using:

> dcast(na.omit(df[,1:2]), v1 ~ v2, length, margins = TRUE)
Using v2 as value column: use value.var to override.
     v1 X Y Z (all)
1     A 1 2 2     5
2     B 1 2 1     4
3 (all) 2 4 3     9

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352
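A variation on the same idea, in case the other columns need to stay in the data frame: filter rows with complete.cases() on only the two tabulated variables, then pass value.var explicitly so dcast does not have to guess the value column. This is only a sketch, not something posted in the thread; the names `keep` and `tab` are made up for illustration, and it assumes reshape2 is installed.

```r
library(reshape2)

# Rebuild the example data from the original post
df <- data.frame(
  v1 = c(rep("A", 5), rep("B", 5), NA),
  v2 = c("X", "Y", "Y", "Z", "Z", "X", "Y", "Y", "Z", NA, "Z"),
  v3 = c(rep("a", 4), "c", "a", "b", NA, "c", "b", "c")
)

# Keep rows where both tabulated variables are present.
# A missing v3 does not matter here, so that row stays in.
keep <- df[complete.cases(df[, c("v1", "v2")]), ]

# length() counts rows per cell; naming value.var avoids the
# "Using ... as value column" message
tab <- dcast(keep, v1 ~ v2, fun.aggregate = length,
             value.var = "v2", margins = TRUE)
print(tab)
```

Because length() only counts rows, any column works as value.var; v2 is used here simply because it is known to have no NA left after the filter.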