Hi All,

I would like to recode my NAs to 0. Using a single vector everything is fine.

But if I use a data.frame things go wrong:

-- cut --

var1 <- c(1:3, NA, 5:7, NA, 9:10)
var2 <- c(1:3, NA, 5:7, NA, 9:10)
ds_test <- data.frame(var1, var2)

test <- var1
test[is.na(test)] <- 0
test  # NA recoded OK

# First try
ds_test[is.na(ds_test$var1)] <- 0  # duplicate subscripts WRONG

# Second try
ds_test[is.na("var1")] <- 0
ds_test$var1  # not recoded WRONG

# Third try: to me the most intuitive approach
is.na(ds_test["var1"]) <- 0  # attempt to select less than one element in integerOneIndex WRONG

# Fourth try
ds_test[is.na(var1)] <- 0  # duplicate subscripts for columns WRONG

-- cut --

How can I do it correctly?

Where could I have found something about it?

Kind regards

Georg
Suggestion: figure out the correct extraction syntax first. Once you do that, replacement will be easy. See ?Extract for all the messy details.

Best,
Ista

On Jun 23, 2016 10:00 AM, <G.Maubach at weinwolf.de> wrote:
> Hi All,
>
> I would like to recode my NAs to 0. [...]

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
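As a minimal sketch of the suggestion above, using the data from the original post: first check that an extraction returns the elements you mean, then assign to that same expression. The helper name `na_rows` is introduced here for illustration only.

```r
var1 <- c(1:3, NA, 5:7, NA, 9:10)
var2 <- c(1:3, NA, 5:7, NA, 9:10)
ds_test <- data.frame(var1, var2)

# Step 1: extraction. The logical row index goes before the comma.
na_rows <- is.na(ds_test$var1)   # illustrative helper name
ds_test[na_rows, "var1"]         # the NA entries of var1
ds_test$var1[na_rows]            # the same elements via $ and vector indexing

# Step 2: once the extraction returns the intended elements,
# assign to the same expression.
ds_test$var1[na_rows] <- 0
ds_test$var1
```

Only var1 is touched here; var2 keeps its NAs.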
Hello,

You could do

ds_test[is.na(ds_test$var1), ] <- 0  # note the comma

or, more generally,

ds_test[] <- lapply(ds_test, function(x) {x[is.na(x)] <- 0; x})

Hope this helps,

Rui Barradas

Quoting G.Maubach at weinwolf.de:
> Hi All,
>
> I would like to recode my NAs to 0. [...]
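Rui's general form can be tried on the data from the original post as follows. This is a sketch that assumes every column is numeric; with factor or character columns, assigning 0 would need separate handling. The second variant relies on the fact that is.na() applied to a data.frame returns a logical matrix, which `[<-` accepts as an index. The names `ds_a` and `ds_b` are illustrative copies, not part of the thread.

```r
var1 <- c(1:3, NA, 5:7, NA, 9:10)
var2 <- c(1:3, NA, 5:7, NA, 9:10)
ds_test <- data.frame(var1, var2)

# Column-wise replacement; assigning into ds_a[] keeps the data.frame structure
ds_a <- ds_test
ds_a[] <- lapply(ds_a, function(x) { x[is.na(x)] <- 0; x })

# Logical-matrix indexing: replaces every NA cell in one step
ds_b <- ds_test
ds_b[is.na(ds_b)] <- 0

sum(is.na(ds_a))  # number of NA cells left after the lapply() variant
sum(is.na(ds_b))  # number of NA cells left after the matrix-index variant
```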
Dear Georg,

You need to learn a bit more about the subsetting methods, which depend on the structure of the object you're trying to subset.

More specifically, when you run this:
ds_test[is.na(ds_test$var1)]
you get this error:
"Error in `[.data.frame`(ds_test, is.na(ds_test$var1)) : undefined columns selected"

This means that R does not understand which columns you're trying to select, because a single index inside '[' is treated as a column index. But you're actually trying to select rows.

Using a single bracket '[' on a data.frame works the same way as for matrices: you need to specify rows and columns, like this:
ds_test[is.na(ds_test$var1), ]  ## notice the last comma
ds_test[is.na(ds_test$var1), ] <- 0  ## works on all columns because you didn't specify any after the comma

If you want it only for "var1", then you need to specify the column:
ds_test[is.na(ds_test$var1), "var1"] <- 0

It's the same problem with your 2nd and 4th tries (the 4th one has other problems). Your 3rd try does not change ds_test at all.

HTH,
Ivan

--
Ivan Calandra, PhD
Scientific Mediator
University of Reims Champagne-Ardenne
GEGENAA - EA 3795
CREA - 2 esplanade Roland Garros
51100 Reims, France
+33(0)3 26 77 36 89
ivan.calandra at univ-reims.fr
--
https://www.researchgate.net/profile/Ivan_Calandra
https://publons.com/author/705639/

On 23/06/2016 at 15:57, G.Maubach at weinwolf.de wrote:
> Hi All,
>
> I would like to recode my NAs to 0. Using a single vector everything is
> fine.
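Ivan's distinction between leaving the column position empty and naming a column matters most when the columns do not share their NA positions. The sketch below uses a small hypothetical data frame, `ds_demo`, which is not from the thread, to show the difference.

```r
# Hypothetical data: var2 holds valid values in the rows where var1 is NA
ds_demo <- data.frame(var1 = c(1, NA, 3, NA), var2 = c(10, 20, NA, 40))

# No column after the comma: every column in the matching rows is set to 0
all_cols <- ds_demo
all_cols[is.na(all_cols$var1), ] <- 0
all_cols$var2   # 10 0 NA 0 (the valid values 20 and 40 are overwritten)

# Column named after the comma: only var1 changes
one_col <- ds_demo
one_col[is.na(one_col$var1), "var1"] <- 0
one_col$var1    # 1 0 3 0
one_col$var2    # 10 20 NA 40 (unchanged)
```

In the original post both columns have NA in the same rows, so the two forms happen to give the same result there.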