I have a large matrix, about 900 by 400, that contains a few NA values. I have used the following functions to impute the missing data:

    data(pc)
    pc.na <- pc
    pc.roughfix <- na.roughfix(pc.na)
    pc.narf <- randomForest(pc.na, na.action=na.roughfix)

Yet this does not replace the NAs in the matrix. I would like to replace each NA with, say, the mean of its row or column, or with a value based on some type of correlation.

Any help would be appreciated.

--
View this message in context: http://r.789695.n4.nabble.com/Imputing-data-tp4150041p4150041.html
Sent from the R help mailing list archive at Nabble.com.
On Fri, Dec 2, 2011 at 2:16 PM, khlam <khlam at ucsc.edu> wrote:
> So I have a very big matrix of about 900 by 400 and there are a couple of NA
> in the list. I have used the following functions to impute the missing data
>
> data(pc)
> pc.na<-pc
> pc.roughfix <- na.roughfix(pc.na)
> pc.narf <- randomForest(pc.na, na.action=na.roughfix)
>
> yet it does not replace the NA in the list. Presently I want to replace the
> NA with maybe the mean of the rows or columns or some type of correlation.
>
> Any help would be appreciated.

There are several imputation functions available in the various packages - for example, packages Hmisc and e1071 both contain a function called impute, and the package impute contains the function impute.knn for nearest neighbor imputation.

HTH,

Peter
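For the simplest option raised in the question, replacing each NA with the mean of its column or row, no extra package is needed. The following is only a minimal sketch in base R, assuming the data is a numeric matrix; the toy matrix m and the helper names impute_col_mean and impute_row_mean are made up for illustration and are not part of any package.

```r
# Toy numeric matrix standing in for the 900 x 400 data (hypothetical values).
set.seed(1)
m <- matrix(rnorm(20), nrow = 5, ncol = 4)
m[2, 1] <- NA
m[4, 3] <- NA

# Replace each NA with the mean of the observed values in its column.
# Hypothetical helper; assumes x is a numeric matrix.
impute_col_mean <- function(x) {
  col_means <- colMeans(x, na.rm = TRUE)
  na_idx <- which(is.na(x), arr.ind = TRUE)  # two columns: "row" and "col"
  x[na_idx] <- col_means[na_idx[, "col"]]
  x
}

# Same idea, using the mean of the observed values in each row.
impute_row_mean <- function(x) {
  row_means <- rowMeans(x, na.rm = TRUE)
  na_idx <- which(is.na(x), arr.ind = TRUE)
  x[na_idx] <- row_means[na_idx[, "row"]]
  x
}

m.colfix <- impute_col_mean(m)  # m itself still contains the NAs
m.rowfix <- impute_row_mean(m)
```

Both helpers return a completed copy and leave the input untouched, so the result has to be assigned to a new name (or back to the original) before the NAs appear to be gone.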
Hi,

For imputation using the randomForest package, check ?rfImpute

Weidong

On Fri, Dec 2, 2011 at 6:00 PM, Peter Langfelder <peter.langfelder at gmail.com> wrote:
> There are several imputation functions available in the various
> packages - for example, packages Hmisc and e1071 both contain a
> function called impute, and the package impute contains the function
> impute.knn for nearest neighbor imputation.
>
> HTH,
>
> Peter
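As a rough illustration of the rfImpute suggestion, the sketch below follows the pattern of the example on the rfImpute help page, using the built-in iris data rather than the poster's pc data (which is not available here). It assumes the randomForest package is installed and is skipped otherwise. Note that rfImpute needs a response variable without missing values, which the original question does not mention having, so this may or may not fit that case.

```r
# Assumes the randomForest package is installed; the block is skipped if not.
if (requireNamespace("randomForest", quietly = TRUE)) {
  data(iris)
  iris.na <- iris
  set.seed(111)
  # Artificially remove some predictor values; the response (Species) stays complete.
  for (i in 1:4) iris.na[sample(150, sample(20, 1)), i] <- NA

  # Median/mode fill: returns a completed copy, iris.na itself is unchanged.
  iris.roughfix <- randomForest::na.roughfix(iris.na)

  # Proximity-based imputation; requires a response with no NAs (Species here).
  set.seed(222)
  iris.imputed <- randomForest::rfImpute(Species ~ ., iris.na)
}
```

In both calls the imputed values live in the returned object (iris.roughfix, iris.imputed), not in iris.na. By analogy, in the original post the filled-in values would be expected in pc.roughfix rather than in pc.na.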