David L. Van Brunt, Ph.D.
2007-Mar-15 00:22 UTC
[R] replacing all NA's in a dataframe with zeros...
I've seen how to replace the NA's in a single column within a data frame:

> mydata$ncigs[is.na(mydata$ncigs)] <- 0

But this is just one column... I have thousands of columns (!) that I need to do this for, and I can't figure out a way, outside of the dreaded loop, to replace all NA's in an entire data frame (all vars) without naming each var separately. Yikes.

I'm racking my brain on this; it seems like I must be staring at the obvious, but it eludes me. Searches have come up CLOSE, but not quite what I need.

Any pointers?

--
David L. Van Brunt, Ph.D.
mailto:dlvanbrunt@gmail.com

"If Tyranny and Oppression come to this land, it will be in the guise of fighting a foreign enemy."
--James Madison
This should work.

> test.df <- data.frame(x1=c(NA,2,3,NA), x2=c(1,2,3,4), x3=c(1,NA,NA,4))
> test.df
  x1 x2 x3
1 NA  1  1
2  2  2 NA
3  3  3 NA
4 NA  4  4
> test.df[is.na(test.df)] <- 1000
> test.df
    x1 x2   x3
1 1000  1    1
2    2  2 1000
3    3  3 1000
4 1000  4    4

The following search string "cran r replace data.frame NA" in Google (as US user) yielded some good results (5th and 7th entry), but there was another example that explicitly yielded this technique. I can't seem to recall my exact search string.
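A quick way to confirm that the replacement reached every column is to count the missing values before and after. The following is only a minimal sketch: it reuses the test.df example from this reply with 0 as the fill value, and the names n_before and n_after are illustrative, not part of the original post.

```r
# Same toy data frame as in the reply above.
test.df <- data.frame(x1 = c(NA, 2, 3, NA),
                      x2 = c(1, 2, 3, 4),
                      x3 = c(1, NA, NA, 4))

# Count the NA cells across the whole data frame.
n_before <- sum(is.na(test.df))

# Replace every NA in one step using the logical matrix from is.na().
test.df[is.na(test.df)] <- 0

# Count again; this should now be zero.
n_after <- sum(is.na(test.df))
```

Comparing n_before with n_after gives a cheap check that no column was skipped, which is handy when there are too many columns to inspect by eye.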
Since you can index a matrix or dataframe with a matrix of logicals, you can use is.na() to index all the NA locations and replace them all with 0 in one command.

> mydata.df <- as.data.frame(matrix(sample(c(as.numeric(NA), 1), size = 30, replace = TRUE), nrow = 6))
> mydata.df
  V1 V2 V3 V4 V5
1  1 NA  1  1  1
2  1 NA NA NA  1
3 NA NA  1 NA NA
4 NA NA NA NA  1
5 NA  1 NA NA  1
6  1 NA NA  1  1
> is.na(mydata.df)
     V1    V2    V3    V4    V5
1 FALSE  TRUE FALSE FALSE FALSE
2 FALSE  TRUE  TRUE  TRUE FALSE
3  TRUE  TRUE FALSE  TRUE  TRUE
4  TRUE  TRUE  TRUE  TRUE FALSE
5  TRUE FALSE  TRUE  TRUE FALSE
6 FALSE  TRUE  TRUE FALSE FALSE
> mydata.df[is.na(mydata.df)] <- 0
> mydata.df
  V1 V2 V3 V4 V5
1  1  0  1  1  1
2  1  0  0  0  1
3  0  0  1  0  0
4  0  0  0  0  1
5  0  1  0  0  1
6  1  0  0  1  1
>

Steven McKinney
Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre
email: smckinney at bccrc.ca
tel: 604-675-8000 x7561

BCCRC Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. V5Z 1L3
Canada
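One caveat worth keeping in mind: the examples in this thread contain only numeric columns. If a data frame also holds character or factor columns, filling everything with 0 may not be what you want (a factor column, for instance, will typically warn about an invalid level rather than accept the 0). Here is a minimal sketch, assuming you only want to zero-fill the numeric columns; the data frame mixed.df and the variable num_cols are made up for illustration and do not come from the original posts.

```r
# Hypothetical data frame with one character and two numeric columns.
mixed.df <- data.frame(id = c("a", NA, "c"),
                       x  = c(1, NA, 3),
                       y  = c(NA, 5, 6),
                       stringsAsFactors = FALSE)

# Flag the numeric columns (one TRUE/FALSE per column).
num_cols <- vapply(mixed.df, is.numeric, logical(1))

# Replace NA with 0 column by column, touching numeric columns only.
mixed.df[num_cols] <- lapply(mixed.df[num_cols],
                             function(col) replace(col, is.na(col), 0))
```

The NA in the id column is left untouched, so missing labels stay visibly missing while the numeric columns are ready for arithmetic.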