Dear R community:

Is it possible to replace NAs in a data frame with zeroes? What should I do?

Thanks in advance,
Juan Pablo

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Juan,

Replacing NAs is relatively easy. Nonetheless, I wrote some simple code to make it even easier for me; you might find it useful. The first function replaces NAs with the mean of the non-missing values; the second replaces NAs with a value you specify. Each works on a single vector; you can also use the apply() function to use it across multiple columns in a data frame or matrix.

If anyone is interested, I also have code to produce least squares means by levels of a factor given any number of covariates, and code for classical item analysis (reliability and item characteristics).

    # Replace NAs in x with the mean of the non-missing values
    replace.na.m <- function(x) {
      X <- mean(x, na.rm = TRUE)
      ifelse(is.na(x), X, x)
    }

    # Replace NAs in x with a value you specify
    replace.na.x <- function(x, value) {
      ifelse(is.na(x), value, x)
    }
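For the original question (zeroes in a data frame), a minimal sketch follows. The data frames and column names are made up for illustration. Note that apply() converts a data frame to a matrix first, so working column by column with lapply() may be safer when some columns are not numeric.

```r
# Hypothetical all-numeric data frame: a logical-matrix index replaces every NA.
df2 <- data.frame(x = c(NA, 1), y = c(2, NA))
df2[is.na(df2)] <- 0

# Hypothetical mixed data frame: touch only the numeric columns.
df <- data.frame(a = c(1, NA, 3), b = c(NA, 5, 6), g = c("u", NA, "w"),
                 stringsAsFactors = FALSE)
num <- vapply(df, is.numeric, logical(1))   # TRUE for numeric columns
df[num] <- lapply(df[num], function(col) {
  col[is.na(col)] <- 0                      # zero out the missing entries
  col
})
```

In the second case the character column g keeps its NA, which is usually what you want; a zero would not be a meaningful value there.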
I would strongly advise against replacing missing data with unconditional means. Even deleting the case is better.

Regards
Ross Darnell

Brett Magill <freud at starpower.net> writes:
> The first replaces NAs with the mean, the second replaces NAs with a value
> you specify.

-- 
Ross Darnell
School of Health and Rehabilitation Sciences
University of Queensland
Phone +61 (0)7 3365 6087
Fax +61 (0)7 3365 4754
Email r.darnell at shrs.uq.edu.au
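As a rough illustration of deleting incomplete cases rather than filling them in, here is a small sketch using base R. The data frame d and its columns are hypothetical.

```r
# Hypothetical data with one missing value in each of two rows.
d <- data.frame(y = c(1.2, NA, 3.1, 4.0), x = c(10, 11, NA, 13))

# Keep only the rows with no missing values in any column.
complete <- d[complete.cases(d), ]

# na.omit() drops the same rows and records which ones were removed.
same <- na.omit(d)
```

Both results keep rows 1 and 4 only. Many modelling functions, lm() among them, drop incomplete rows by default through the na.action option, so explicit deletion is often unnecessary before fitting.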
With this function you may replace the NAs with the mean or median of the non-missing values:

    ## replace NA
    ##
    rep.na <- function(x, my.mean = TRUE) {
      # Fill value: the mean (default) or the median of the non-missing values
      if (!my.mean) {
        valore <- median(x[!is.na(x)])
      } else {
        valore <- mean(x[!is.na(x)])
      }
      for (i in seq_along(x)) {
        if (is.na(x[i])) x[i] <- valore
      }
      x  # return the filled vector instead of overwriting the global x
    }
    ##
    ##

For example:

> (x <- c(NA,12,NA,14,15,17,21))
[1] NA 12 NA 14 15 17 21
> (rep.na(x))
[1] 15.8 12.0 15.8 14.0 15.0 17.0 21.0
> (rep.na(x, my.mean=FALSE))
[1] 15 12 15 14 15 17 21

Good job, isaia.

-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~ Ennio D. Isaia
~ Dep. of Statistics & Mathematics, University of Torino
~ Piazza Arbarello, 8 - 10128 Torino (Italy)
~ Phone: +39 011 670 62 51 ~~ Fax: +39 011 670 62 39
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
But, assuming you accept replacing each NA with the global mean or median, the same operation can be done without the for() loop; vectorized indexing is shorter and usually faster in R:

> x <- c(NA,12,NA,14,15,17,21)
> x[is.na(x)] <- median(x, na.rm=TRUE)
> x
[1] 15 12 15 14 15 17 21

Agus

Dr. Agustin Lobo
Instituto de Ciencias de la Tierra (CSIC)
Lluis Sole Sabaris s/n
08028 Barcelona SPAIN
tel 34 93409 5410
fax 34 93411 0012
alobo at ija.csic.es

On Wed, 27 Feb 2002, E. D. Isaia wrote:
> With this function you may replace the NA with the mean or median of the non
> missing values
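The same indexing idiom can be extended from one vector to every numeric column of a data frame. The sketch below is one possible way to do it, not something posted in the thread; fill_median and the data frame d are hypothetical names.

```r
# Hypothetical helper: fill NAs in a numeric vector with its own median.
fill_median <- function(col) {
  if (is.numeric(col)) col[is.na(col)] <- median(col, na.rm = TRUE)
  col  # non-numeric columns are returned unchanged
}

d <- data.frame(x = c(NA, 12, NA, 14, 15, 17, 21),
                y = c(1, 2, NA, 4, 5, 6, 7))

# d[] <- keeps d a data frame while replacing each column with its filled copy.
d[] <- lapply(d, fill_median)
```

Each column gets its own median here (15 for x, 4.5 for y), not one global value across the whole data frame.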