Hi,

I have a dataset of 1,785,421 rows x 200 variables with some missing values. Approximately 55% of the 1,785,421 x 200 cells are missing.

In this document, ftp://ftp.stat.berkeley.edu/pub/users/breiman/Using_random_forests_v4.0.pdf, it is claimed that random forests can impute with great accuracy even with *80%* of the data missing. When it says 80% missing, does it mean that 80% of the cells (rows x variables) can be missing and still be imputed?

Also, can missForest or randomForest's impute function cope with the same level, i.e., impute data with 80% of the values missing?
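
To make the question concrete, here is a minimal sketch in plain Python of what "percent of cells missing" means, as opposed to missingness per variable or per row. The function name `missing_fractions` and the toy table are made up for illustration and are not part of missForest or randomForest. The sketch only counts missing cells; it does not impute anything.

```python
def missing_fractions(table):
    """Return (overall, per_column, per_row) fractions of missing cells.

    `table` is a list of equal-length rows; None marks a missing cell.
    (Hypothetical helper, for illustration only.)
    """
    n_rows = len(table)
    n_cols = len(table[0])
    col_missing = [0] * n_cols
    per_row = []
    for row in table:
        row_count = 0
        for j, value in enumerate(row):
            if value is None:
                col_missing[j] += 1
                row_count += 1
        per_row.append(row_count / n_cols)
    overall = sum(col_missing) / (n_rows * n_cols)
    per_column = [count / n_rows for count in col_missing]
    return overall, per_column, per_row


# Cell counts for the dataset described above (integer arithmetic).
N_ROWS, N_COLS = 1_785_421, 200
TOTAL_CELLS = N_ROWS * N_COLS                   # 357,084,200 cells
APPROX_MISSING_CELLS = TOTAL_CELLS * 55 // 100  # about 55% of all cells

# Toy table: half of all cells are missing, but not evenly spread.
toy = [
    [1.0, None, 3.0, None],
    [None, None, 2.0, 5.0],
    [4.0, None, None, 1.0],
]
overall, per_column, per_row = missing_fractions(toy)

print("total cells:", TOTAL_CELLS)
print("approx. missing cells:", APPROX_MISSING_CELLS)
print("toy overall missing fraction:", overall)
print("toy per-column missing fractions:", per_column)
print("toy per-row missing fractions:", per_row)
```

In the toy table the overall fraction of missing cells is one half, yet the second variable has no observed values at all. So an overall figure such as 55% or 80% says nothing by itself about how the missing cells are spread across variables and rows, which is part of what the question is asking.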