Dear All,

I am a bit stuck on a problem of replacing "" with NA.
I have a big data set, but here is a toy example:

test<-data.frame(
test1=c("","Hi","Hello"),
test2=c("Hi","","Bye"),
test3=c("Hello","",""))

If the data are as above, I can change all "" to NA with this code:

for(i in 1:3){
  for(j in 1:3){
    if(test[j,i]==""){
      test[j,i]=NA
    }
  }
}

but the problem arises if the data frame already has NA in some places:

test<-data.frame(
test1=c("","Hi","Hello"),
test2=c("Hi",NA,"Bye"),
test3=c("Hello","",""))

The loop above does not work on this data frame, as NA has logical
class and the comparison does not return TRUE/FALSE.

Can anyone provide some help?

My sessionInfo is:
R version 3.0.2 (2013-09-25)
Platform: i386-w64-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_India.1252  LC_CTYPE=English_India.1252
    LC_MONETARY=English_India.1252
[4] LC_NUMERIC=C                   LC_TIME=English_India.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] RColorBrewer_1.0-5 plotrix_3.5-2      foreign_0.8-57
     splancs_2.01-34    spatstat_1.34-0
 [6] polyclip_1.1-0     tensor_1.5         abind_1.4-0
     deldir_0.1-1       mgcv_1.7-26
[11] nlme_3.1-111       xlsx_0.5.1         xlsxjars_0.5.0
     rJava_0.9-4        ggplot2_0.9.3.1
[16] rgdal_0.8-11       rgeos_0.3-2        maptools_0.8-27    sp_1.0-14

loaded via a namespace (and not attached):
 [1] colorspace_1.2-4 dichromat_2.0-0  digest_0.6.3     grid_3.0.2
     gtable_0.1.2
 [6] labeling_0.2     lattice_0.20-23  MASS_7.3-29      Matrix_1.0-14
     munsell_0.4.2
[11] plyr_1.8         proto_0.3-10     reshape2_1.2.2   scales_0.2.3
     stringr_0.6.2
[16] tcltk_3.0.2      tools_3.0.2
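For readers following the thread, here is a minimal sketch of where the loop stops and one way to guard it. It assumes character columns (stringsAsFactors = FALSE is added here and is not in the original post); the names cmp and loop_error are only illustrative.

```r
# Second toy data frame from the post. stringsAsFactors = FALSE is an
# assumption added here so the columns are plain character vectors.
test <- data.frame(
  test1 = c("", "Hi", "Hello"),
  test2 = c("Hi", NA, "Bye"),
  test3 = c("Hello", "", ""),
  stringsAsFactors = FALSE
)

# Comparing NA with "" gives NA rather than TRUE or FALSE ...
cmp <- test[2, 2] == ""

# ... and if() signals an error when its condition is NA. The error
# message is captured here instead of stopping the script.
loop_error <- tryCatch(
  if (cmp) "blank" else "not blank",
  error = function(e) conditionMessage(e)
)

# Guarding the comparison with !is.na() keeps the condition TRUE/FALSE,
# so the original double loop runs to the end.
for (i in seq_len(ncol(test))) {
  for (j in seq_len(nrow(test))) {
    if (!is.na(test[j, i]) && test[j, i] == "") {
      test[j, i] <- NA
    }
  }
}
```

The guarded loop still visits every cell one by one, so on a large data set a vectorised replacement is likely to be the better choice.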
On Jan 6, 2014, at 5:57 AM, vikram ranga <babuawara at gmail.com> wrote:

> Dear All,
>
> I am a bit stuck on a problem of replacing "" with NA.

<snip>

See ?is.na, which is used to test for NA values and is the canonical way to replace values with NA:

> test
  test1 test2 test3
1          Hi Hello
2    Hi
3 Hello   Bye

# Where test == "", replace with NA
is.na(test) <- test == ""

> test
  test1 test2 test3
1  <NA>    Hi Hello
2    Hi  <NA>  <NA>
3 Hello   Bye  <NA>

Regards,

Marc Schwartz
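The output above is for the first toy data frame. As a sketch, the same replacement form can be tried on the second one, which already contains an NA. This assumes character columns (stringsAsFactors = FALSE is added here); the name blank is only illustrative.

```r
# Second toy data frame from the original post, with an NA already present.
# stringsAsFactors = FALSE is an assumption added for this sketch.
test <- data.frame(
  test1 = c("", "Hi", "Hello"),
  test2 = c("Hi", NA, "Bye"),
  test3 = c("Hello", "", ""),
  stringsAsFactors = FALSE
)

# Logical matrix with the same shape as test; the entry for the cell that
# is already NA is itself NA.
blank <- test == ""

# Replacement form of is.na(): set the cells flagged TRUE to NA.
is.na(test) <- blank
```

No loop or if() is involved, so the NA in the comparison result never has to serve as a condition.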
try this:

> test<-data.frame(
+ test1=c("","Hi","Hello"),
+ test2=c("Hi",NA,"Bye"),
+ test3=c("Hello","",""))
> test
  test1 test2 test3
1          Hi Hello
2    Hi  <NA>
3 Hello   Bye
>
> test[] <- lapply(test, function(x){
+     x[!is.na(x) & x == ''] <- NA
+     x
+ })
> test
  test1 test2 test3
1  <NA>    Hi Hello
2    Hi  <NA>  <NA>
3 Hello   Bye  <NA>
>

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

On Mon, Jan 6, 2014 at 6:57 AM, vikram ranga <babuawara at gmail.com> wrote:
> Dear All,
>
> I am a bit stuck on a problem of replacing "" with NA.
>
> <snip>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
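For a larger data set with mixed column types, the lapply() idea above could be wrapped in a small helper. This is only a sketch: blank_to_na, mixed and cleaned are hypothetical names that do not appear in the thread, and the helper deliberately touches character columns only.

```r
# Hypothetical helper (not from the thread): apply the per-column
# replacement to character columns and leave other columns alone.
blank_to_na <- function(df) {
  df[] <- lapply(df, function(x) {
    if (is.character(x)) {
      # Same test as in the reply: skip existing NAs, match empty strings.
      x[!is.na(x) & x == ""] <- NA
    }
    x
  })
  df
}

# Illustrative data with one character and one numeric column.
mixed <- data.frame(
  label = c("", "Hi", NA),
  count = c(1, NA, 3),
  stringsAsFactors = FALSE
)

cleaned <- blank_to_na(mixed)
```

Assigning into df[] rather than df keeps the result a data frame with its original column names.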
Hi,

Try:

test[test=="" & !is.na(test)] <- NA

A.K.

On Monday, January 6, 2014 7:51 AM, vikram ranga <babuawara at gmail.com> wrote:

> Dear All,
>
> I am a bit stuck on a problem of replacing "" with NA.

<snip>
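The three suggestions in this thread can be checked against each other. The sketch below assumes character columns (stringsAsFactors = FALSE is added here); make_test and the variable names a, b, d and same are only illustrative.

```r
# Rebuild the second toy data frame for each approach.
make_test <- function() {
  data.frame(
    test1 = c("", "Hi", "Hello"),
    test2 = c("Hi", NA, "Bye"),
    test3 = c("Hello", "", ""),
    stringsAsFactors = FALSE
  )
}

# Replacement form of is.na() with a logical matrix.
a <- make_test()
is.na(a) <- a == ""

# Column-wise replacement with lapply().
b <- make_test()
b[] <- lapply(b, function(x) {
  x[!is.na(x) & x == ""] <- NA
  x
})

# Logical-matrix indexing on the whole data frame.
d <- make_test()
d[d == "" & !is.na(d)] <- NA

# all.equal() compares the contents of the three results.
same <- isTRUE(all.equal(a, b)) && isTRUE(all.equal(b, d))
```

If same is TRUE, the choice among the three comes down to readability and, on a big data set, speed.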