Hello and thank you dear R-people in advance.

This is quite a basic question, but one I have run into occasionally and got past without a satisfying solution. The question is about factors: this time I would just like to convert a data.frame's NA terms to 0 and get a data.frame as a result. There might be a way to do that inside the data.frame, but I think it might be overcomplicated and possibly slow. With a matrix it is easy and clean:

    X <- (ifelse(is.na(X), 0, X))  ### Applying to data.frames yields a list..

or

    "na.to.0" <- function(x)
    {
      x <- as.matrix(x)     ### Just to be sure
      x[is.na(x)] <- 0
      x <- data.frame(x)    ### PROBLEM
      x
    }

So the problem comes when converting the result to a data.frame (this is sometimes also a problem when importing a data.frame!). All character columns go to factors, as documented in the help. That's something one can avoid by using I() or by calling type.convert later (convert.col.type in Splus, if I recall correctly), but somehow I think there should be a way to make it easier, at least when converting a data.frame to a matrix and back to a data.frame.

The other, related question is odd. This time I have a numeric column in a data.frame (at least it should be numeric) which I have fetched from Excel through RODBC (it's great). But when I try to convert NA to 0, as a side effect these columns get converted to characters:

    > is.numeric(as.matrix(KUNTADATA[,15]))
    [1] TRUE
    > is.numeric(as.data.frame(as.matrix(KUNTADATA[,15])))
    [1] FALSE

or

    > is.numeric(data.frame(as.matrix(KUNTADATA[,15])))
    [1] FALSE

as.numeric works of course, but that is not the way to write good, error-robust code.

Please let me know if you have any idea how to avoid the automatic factor or character (last case) conversion.

Jussi
Analytics
State Treasury of Finland, Finance

PS. version:

    platform i386-pc-mingw32
    arch     x86
    os       Win32
    system   x86, Win32
    status
    major    1
    minor    4.1
    year     2002
    month    01
    day      30
    language R

Platform is Windows NT4 (not my choice..)
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
To answer my own question, at least partially:

    "na.to.0" <- function(x)
    {
      xx <- data.matrix(x)
      xx[is.na(x)] <- 0
      xx <- data.frame(xx)
      xx
    }

seems to work (the idea is/was to replace a data.frame's NAs with 0s and return a data.frame as a result).

I'm still a little confused by these conversions, but now I can move on.

Sorry for this monologue,
Jussi
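A small illustration of the matrix round trip discussed above may help. It is only a sketch: the data frame `df` and its column names are invented for the example, and the behaviour described is that of current R versions (it may have differed in R 1.4.1). The point is that as.matrix() on a mixed data frame yields a character matrix, while data.matrix() forces a numeric one by replacing text with integer codes, so the original strings are lost.

```r
# Hypothetical mixed data frame (names and values invented for illustration)
df <- data.frame(
  name  = c("a", "b", "a"),
  value = c(1.5, NA, 3),
  stringsAsFactors = FALSE
)

# as.matrix() must pick a single type for all cells; with a character
# column present, the whole matrix becomes character
m_chr <- as.matrix(df)
is.character(m_chr)

# data.matrix() always returns a numeric matrix; in current R a character
# column is converted to a factor and then to its integer codes
m_num <- data.matrix(df)
is.numeric(m_num)
m_num[, "name"]
```

So a na.to.0 built on data.matrix() returns a data frame of numbers only, which is fine for all-numeric data but silently replaces text columns with codes.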
>> To answer at least partially to my own question:
>>
>> "na.to.0" <- function(x)
>> {
>>   xx <- data.matrix(x)
>>   xx[is.na(x)] <- 0
>>   xx <- data.frame(xx)
>>   xx
>> }
>>
>> seems to work (Idea is/was replace a data.frame NAs by 0s and return a
>> data.frame as a result).
>>
>> Still I'm a little bit confused with these converts but now I can move on.

> I'm confused by your questions. Is this a data frame with only numeric
> columns? If so, your comments about factors/characters make no sense,
> and if not your conversion makes little sense.
>
> For a *numeric* data frame X
>
>     X[] <- lapply(X, function(x) {x[is.na(x)] <- 0; x})
>
> seems to be what you want.

Here is a sample of my data.frame (KUNTADATA):
Sorry, my last answer slipped away because of an accidental keyboard shortcut.

Thank you for your reply. Here is a sample of my data.frame (KUNTADATA):

    KUNTADATA[1:10,c(6:8, 20)]
        Kunta Period Name Asunnot 1100 intarr_14
     1  ESPOON KAUPUNKI 1993/1 14164  41336.27
     2  ESPOON KAUPUNKI 1993/2 14164        NA
     3  ESPOON KAUPUNKI 1994/1 14164      0.00
     4  ESPOON KAUPUNKI 1994/2 14164    330.29
     5  ESPOON KAUPUNKI 1995/1 14164      0.00
     6  ESPOON KAUPUNKI 1995/2 14164      0.00
     7  ESPOON KAUPUNKI 1996/1 14164  67277.18
     8  ESPOON KAUPUNKI 1996/2 14164   7860.26
     9  ESPOON KAUPUNKI 1997/1 14164        NA
    10  ESPOON KAUPUNKI 1997/2 14164 231701.05

So there are both character and numerical vectors. But truly - I could just use na.to.0 when necessary with the numerical columns. But because I have faced this "problem" with factors quite often, I thought it might be of common interest to ask my question.

I still cannot understand the result:

    > is.numeric(as.matrix(KUNTADATA[,15]))
    [1] TRUE
    > is.numeric(as.data.frame(as.matrix(KUNTADATA[,15])))
    [1] FALSE

Yes, your code works nicely (as always) and does the trick I wanted, something I couldn't have written myself. There is a lot for me to learn in R/Splus.

Thank you, I appreciate your help and this list,
Jussi Makinen
Sorry, my previous answer slipped away because of an accidental keyboard shortcut while I was trying to copy/paste the example.

***

Thank you for your reply. Here is a sample of my data.frame (KUNTADATA):

    KUNTADATA[1:10,c(6:8, 20)]
        Kunta Period Name Asunnot 1100 intarr_14
     1  ESPOON KAUPUNKI 1993/1 14164  41336.27
     2  ESPOON KAUPUNKI 1993/2 14164        NA
     3  ESPOON KAUPUNKI 1994/1 14164      0.00
     4  ESPOON KAUPUNKI 1994/2 14164    330.29
     5  ESPOON KAUPUNKI 1995/1 14164      0.00
     6  ESPOON KAUPUNKI 1995/2 14164      0.00
     7  ESPOON KAUPUNKI 1996/1 14164  67277.18
     8  ESPOON KAUPUNKI 1996/2 14164   7860.26
     9  ESPOON KAUPUNKI 1997/1 14164        NA
    10  ESPOON KAUPUNKI 1997/2 14164 231701.05

So there are both types of vectors: character and numerical. But truly - I could just use na.to.0 when necessary with the numerical columns. But because I have faced this "problem" with conversions quite often, I thought it might be of common interest to ask my question.

I still cannot understand the result:

    > is.numeric(as.matrix(KUNTADATA[,15]))
    [1] TRUE
    > is.numeric(as.data.frame(as.matrix(KUNTADATA[,15])))
    [1] FALSE
    > mode(as.data.frame(as.matrix(KUNTADATA[1,15])))
    [1] "list"

I'm sure that I just do not get the basic feature behind the data.frame() function; understanding it would be valuable for me and perhaps for somebody else as well.

Yes, your code works nicely (as always) and does the trick I wanted, something I couldn't have written myself. There is a lot for me to learn in R/Splus.

Thank you, I appreciate your help and all the learning I have achieved through the list,
Jussi Makinen
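On the puzzling FALSE above: a data frame is a list of columns, so is.numeric() applied to the data frame itself is FALSE even when every column is numeric, which is consistent with the reported mode of "list". The test has to be made per column. The sketch below uses an invented data frame `df` and a hypothetical helper `na_to_0` (neither is from the thread); it applies the lapply() idea from the reply to the numeric columns only, so character columns pass through untouched.

```r
# Hypothetical mixed data frame (names and values invented for illustration)
df <- data.frame(
  name  = c("a", "b", "a"),
  value = c(1.5, NA, 3),
  stringsAsFactors = FALSE
)

is.numeric(df)                       # tests the list, not its columns
mode(df)
num_cols <- sapply(df, is.numeric)   # one logical per column

# Hypothetical helper: replace NA by 0 in numeric columns only
na_to_0 <- function(x) {
  num <- vapply(x, is.numeric, logical(1))
  x[num] <- lapply(x[num], function(col) {
    col[is.na(col)] <- 0
    col
  })
  x
}

res <- na_to_0(df)
```

Because the replacement never goes through a matrix, no column changes type and the result is still a data frame.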