I'm discovering R (very impressive), and didn't find in the docs a simple
method for replacing, in a data frame, missing values (NA) with the
column's mean (or any other method for reconstructing missing values when
needed).

Thanks in advance for your help.

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
On Tue, 8 May 2001, Jean Vidal wrote:

> I'm discovering R (very impressive), and didn't find in the docs a simple
> method for replacing, in a data frame, missing values (NA) with the
> column's mean (or any other method for reconstructing missing values when
> needed).

That's the purpose of na.action's, except that mean imputation is not a
great idea. If you look at na.omit it would be easy to write
na.mean.impute if you really want to.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
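For readers who want to see what such a function might look like, here is a minimal sketch of the na.mean.impute idea mentioned above. It is not code from the thread: the function body, the toy data frame `d`, and the model `y ~ x` are all assumptions made for illustration. It relies only on the convention that an na.action function receives a data frame (the model frame) and returns one. Note that it fills gaps in every numeric column, the response included, and the caution that mean imputation is not a great idea still applies.

```r
# Hypothetical na.action: fill NAs in each numeric column with the column mean.
# Illustrative sketch only; not part of base R.
na.mean.impute <- function(object, ...) {
  for (j in seq_along(object)) {
    x <- object[[j]]
    # Leave non-numeric columns and matrix-valued terms untouched.
    if (is.numeric(x) && !is.matrix(x) && anyNA(x)) {
      x[is.na(x)] <- mean(x, na.rm = TRUE)
      object[[j]] <- x
    }
  }
  object
}

# Toy data (invented for this example): one missing predictor value.
d <- data.frame(y = c(1, 2, 3, 4), x = c(1, NA, 3, 4))

# Direct use on a data frame.
filled <- na.mean.impute(d)

# Use as the na.action of a model fit, so no rows are dropped.
fit <- lm(y ~ x, data = d, na.action = na.mean.impute)
n_used <- nobs(fit)
```

With na.omit the fit would use three rows; with this sketch all four rows are kept, at the cost of understating the variability of x.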
Jean Vidal <jean.vidal at freesurf.fr> writes:

>I'm discovering R (very impressive), and didn't find in the docs a simple
>method for replacing, in a data frame, missing values (NA) with the
>column's mean (or any other method for reconstructing missing values when
>needed).
>Thanks in advance for your help.

There are a series of na.* functions in R but they are not well
documented (which is my way of saying "I can't work out how to make them
work"!).

I use indexing to deal with missing values. For example, to replace a
missing value code (e.g. -99) with NA:

        var[var == -99] <- NA

Replacing with an imputed value can be done in the same manner but with
the imputation function on the RHS of the assignment. For example:

        var[is.na(var)] <- mean(var, na.rm = TRUE)

If var is a column in a data.frame then you need to specify the
data.frame:

        df$var[is.na(df$var)] <- mean(df$var, na.rm = TRUE)

I hope that helps.

Mark

BTW: While I am here, can anyone explain how the na.* functions work?

-- 
Mark Myatt
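The indexing idiom above handles one column at a time. As a rough sketch of how it could be extended to every numeric column of a data frame, which is what the original question asked for, something like the following may work. The helper name `impute_column_means` and the `example` data are invented for illustration and do not come from the thread.

```r
# Hypothetical helper: replace NAs in every numeric column with that
# column's mean, leaving non-numeric columns as they are.
impute_column_means <- function(df) {
  for (name in names(df)) {
    column <- df[[name]]
    if (is.numeric(column) && anyNA(column)) {
      # Same idiom as in the message: index the NAs, assign the mean.
      column[is.na(column)] <- mean(column, na.rm = TRUE)
      df[[name]] <- column
    }
  }
  df
}

# Invented example data with gaps in numeric and character columns.
example <- data.frame(
  height = c(1.5, NA, 1.7),
  weight = c(60, 70, NA),
  group  = c("a", NA, "b"),
  stringsAsFactors = FALSE
)

imputed <- impute_column_means(example)
print(imputed)
```

A column that is entirely NA has no mean to fall back on, so it would end up filled with NaN; callers may want to check for that case first.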
Jean Vidal wrote:

> I'm discovering R (very impressive), and didn't find in the docs a simple
> method for replacing, in a data frame, missing values (NA) with the
> column's mean (or any other method for reconstructing missing values when
> needed).

Joe Schafer has written a package called "norm" for the analysis of
multivariate normal datasets with missing values, which has been ported
to R by Alvaro A. Novo, and can be found on CRAN as a contributed package.
norm uses the method of multiple imputation (both the Expectation
Maximization algorithm and Data Augmentation) to impute missing values.

Joe Schafer has lots of information (docs and slide presentations) about
multiple imputation and on the use of norm. He has also written a book,
"Analysis of Incomplete Multivariate Data". You can get more information
on all of this at his website at http://www.stat.psu.edu/~jls/ .

You may also want to consult Gary King, who has also written a program
for imputing missing values and also has some documentation on this at
http://GKing.Harvard.Edu/stats.shtml

I would recommend reading all the information on those sites before
using norm.

I hope this will help.

Peter

------------------------------------------------------
Peter Ho GradIFST
Escola Superior de Biotecnologia
Universidade Católica Portuguesa
Rua Dr. António Bernardino de Almeida
4200-072 Porto
Portugal